Annotation decoding
Patent Assignee: DING 3 (DING-I)
Inventor: DING D

Patent Family (1 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update

CN 101038585 A 20070919 CN 200610070821 A 20060313 200810 B 

Priority Applications (no., kind, date): CN 200610070821 A 20060313 

Patent Details 

Number Kind Lan Pg Dwg Filing Notes 

CN 101038585 A ZH 1 
CN A 

NOVELTY - Annotation decoding is a simple, practical, understandable and
universally acceptable digital transforming code that can label, or
substitute for, the characters and letters of the languages of all
countries. It comprises ten Arabic numeral identifiers, digital codes
arranged in alphabetic order, Chinese, combined characters, spelling rules
and phonetic signs. If each country sets up letter, term and sentence
libraries for its language using the code, which lends itself to digital
coding of each country's letters and characters, a letter, term or
sentence can be found with the help of the sound given by the annotation
decoding and can be transformed directly by computer. The invention is
therefore an aid to international exchange.



Title Terms/Index Terms/Additional Words: decode
Class Codes

International Classification (+ Attributes)
IPC + Level Value Position Status Version

G06F-0017/28 A I F 20060101 
...the code, which is favor for the digital code of the language letter /
character of each country, the letter, term, sentence can be found
with the help of the voice of the annotation decoding and can be directly
transformed...

Computer implemented document summarizing method for authors and readers,
involves constructing and inserting sentence based summary of document's
writings at beginning of document

Patent Assignee: COKUS S J (COKU-I); DOLAN W B (DOLA-I); FEIN R A
(FEIN-I); FRIES E J (FRIE-I); MESSERLY J (MESS-I); MICROSOFT CORP
(MICT); THORPE C A (THOR-I)

Inventor: COKUS S J; DOLAN W B; FEIN R A; FRIES E J; MESSERLY J; THORPE C A
Alerting Abstract US A1

NOVELTY - A sentence based summary of a document's writings is
constructed and inserted at the beginning of the document.
DESCRIPTION - INDEPENDENT CLAIMS are included for the following: 

1. Word processing application;

2. Electronic mail application;

3. Internet web browser application;

4. Computer summarizing document; and

5. Document file.

USE - For summarizing documents to assist authors and readers.

ADVANTAGE - Enables authors to automatically create summaries of their
writings with improved quality, in a way that is convenient and useful to
the author.

Title Terms/Index Terms/Additional Words: computer; implement; document;
summary; method; read; construction; insert; sentence; based; begin
...authors and readers, involves constructing and inserting sentence based
summary of document's writings at beginning of document

...A sentence based summary of a document's writings is constructed and
inserted at the beginning of the document.



Original Publication Data by Authority 



Original Abstracts: 

...summarizer performs a statistical analysis to generate a list of ranked
sentences for consideration in the summary. The summarizer counts how 
frequently content words appear in a document and produces a table 
correlating the content words with their corresponding frequency counts. 
Phrase compression techniques... 

...inclusion of the sentence have been satisfied. The summarizer then 
inserts the sentence at the beginning of the document before the 
start of the text...

...analysis to generate a list of ranked sentences for consideration in the 
summary. The summarizer counts how frequently content words appear 
in a document and produces a table correlating the content words with their 
corresponding frequency... 

...inclusion of the sentence have been satisfied. The summarizer then 
inserts the sentence at the beginning of the document before the start 
of the text.
Claims: 

...based summary of a document's writings; and inserting the sentence-based
summary at a beginning of the document.

...frequently words appear in the document, a computer-implemented method
comprising: evaluating words in the document to identify ordered sets
of words that appear repeatedly in a same order; ranking individual
sentences in the document by treating the ordered sets of words as
if they were single words; generating the summary based at least in part on
the sentence rankings; inserting the summary into a file comprising the
document; and saving the file to non-volatile data storage.
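
The claim excerpt above (ordered sets of words treated as single words when ranking sentences) can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names are hypothetical, and it assumes whitespace tokenization, two-word phrases only, a repetition threshold of two, and no stop-word filtering, none of which are stated in the record.

```python
from collections import Counter

def find_repeated_pairs(sentences, min_count=2):
    """Return ordered word pairs that occur at least min_count times (assumed threshold)."""
    pairs = Counter()
    for words in sentences:
        pairs.update(zip(words, words[1:]))
    return {p for p, n in pairs.items() if n >= min_count}

def compress(words, phrases):
    """Merge each repeated pair into one token so it is counted as a single word."""
    out, i = [], 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in phrases:
            out.append(words[i] + "_" + words[i + 1])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

def rank_sentences(text_sentences):
    """Rank sentence indices by summed token frequency after phrase compression."""
    tokenized = [s.lower().split() for s in text_sentences]
    phrases = find_repeated_pairs(tokenized)
    compressed = [compress(w, phrases) for w in tokenized]
    freq = Counter(t for sent in compressed for t in sent)
    scores = [sum(freq[t] for t in sent) for sent in compressed]
    # Highest score first; ties keep document order because the sort is stable.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

For example, in the sentences "new york is big" and "new york is busy" the pair "new york" repeats, so it is counted once per sentence as the single token new_york rather than as two separate words.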
Computer-based document summarization method for word processor, involves
performing cue-phrase analysis by comparing words and phrases in specified
sentences with pre-compiled list

Patent Assignee: COKUS S J (COKU-I); DOLAN W B (DOLA-I); FEIN R A
(FEIN-I); FRIES E J (FRIE-I); MESSERLY J (MESS-I); MICROSOFT CORP
(MICT); THORPE C A (THOR-I)

Inventor: COKUS S J; DOLAN W B; FEIN R A; FRIES E J; MESSERLY J; THORPE C A
US 20010021938 A1 EN 12 3 Continuation of application US
Continuation of patent US 5924108

Alerting Abstract US A1

NOVELTY - The individual sentences are scored with the corresponding 
rankings according to respective frequency of content words. A cue-phrase 
analysis is performed by comparing words and phrases in sentences with the 
pre-compiled list. A summary is created based on comparison result. 
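
As a rough illustration of the cue-phrase step described above, the following Python sketch adjusts existing sentence scores by comparing each sentence against a pre-compiled list. The phrases, their weights and the function names are hypothetical; the record does not say what the list contains or how a match changes a score.

```python
# Hypothetical pre-compiled list: positive weights promote a sentence,
# negative weights demote it. These entries are assumptions for illustration.
CUE_PHRASES = {"in conclusion": 2.0, "in summary": 2.0, "for example": -1.0}

def cue_adjustment(sentence, cue_phrases=CUE_PHRASES):
    """Sum the weights of all cue phrases found in the sentence."""
    text = " ".join(sentence.lower().split())
    return sum(weight for phrase, weight in cue_phrases.items() if phrase in text)

def apply_cues(sentences, base_scores):
    """Combine frequency-based scores with the cue-phrase adjustment."""
    return [score + cue_adjustment(s) for s, score in zip(sentences, base_scores)]
```

Here base_scores stands for the frequency-based rankings mentioned in the abstract; the summary would then be built from the sentences with the highest combined scores.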

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following:

1. Word processing application;

2. Electronic mail application;

3. Internet web browser application;

4. Programmed computer for document summarizing;

5. Document file.

USE - For summarizing documents for word processing, electronic mail and 
internet web browser applications. 

ADVANTAGE - Designed from the author's standpoint, the method enables the
author to create summaries automatically using the statistical and
cue-phrase approach and to place the created summaries conveniently at the
top of the document, which improves the quality of the final summary.

DESCRIPTION OF DRAWINGS - The figure shows the computer loaded with word 
processing program for performing document summarizer function. 

Title Terms/Index Terms/Additional Words: computer; based; document; method;
word; processor; performance; cue; phrase; analyse; compare; specified;
sentence; pre; compile; list
International Classification (+ Attributes)
Original Abstracts:

...summarizer performs a statistical analysis to generate a list of ranked 
sentences for consideration in the summary. The summarizer counts how 
frequently content words appear in a document and produces a table 
correlating the content words with their corresponding frequency counts. 
Phrase compression techniques... 

...inclusion of the sentence have been satisfied. The summarizer then
inserts the sentence at the beginning of the document before the
start of the text...

...performs a statistical analysis to generate a list of ranked sentences
for consideration in the summary. The summarizer counts how frequently
content words appear in a document and produces a table correlating
the content words with their corresponding frequency counts. Phrase
compression techniques are used to produce more accurate counts...

...inclusion of the sentence have been satisfied. The summarizer then
inserts the sentence at the beginning of the document before the
start of the text.
Claims: 

...A computer-implemented method for summarizing documents, comprising the
following steps: counting how frequently content words appear in a
document; scoring individual sentences according to their respective
content words, wherein sentences which contain more content words that
appear more frequently in the document are...

...appear in a document to produce frequency counts for corresponding
content words; (b) scoring individual sentences according to the content
words contained in the sentences; (c) identifying a sentence with
the highest score; (d) adjusting the frequency counts of the content
words that appear in the highest scoring sentence to remove an
influence of the highest scoring sentence; and (e) re-scoring the
sentences based on the adjusted frequency counts.
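
Steps (a) to (e) of the claim excerpt above can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed method itself: the stop-word list is illustrative, a sentence's score is taken to be the sum of its content-word counts, and the step (d) adjustment is assumed to subtract the occurrences contributed by the chosen sentence. The record specifies none of these details, and the function names are hypothetical.

```python
from collections import Counter

STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}  # illustrative only

def content_words(sentence):
    """Lower-case, split on whitespace and drop the illustrative stop words."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]

def summarize(sentences, k=2):
    """Return indices of k sentences chosen by repeated score / adjust / re-score."""
    words = [content_words(s) for s in sentences]
    counts = Counter(w for ws in words for w in ws)           # step (a)
    chosen = []
    remaining = set(range(len(sentences)))
    while remaining and len(chosen) < k:
        # steps (b) and (e): score each sentence with the current counts
        scores = {i: sum(counts[w] for w in words[i]) for i in remaining}
        # step (c): highest score, earliest sentence wins a tie
        best = max(sorted(remaining), key=lambda i: scores[i])
        chosen.append(best)
        remaining.discard(best)
        # step (d), assumed form: remove this sentence's own occurrences
        counts.subtract(words[best])
    return sorted(chosen)
```

With the sentences "the cat sat", "the cat ran" and "dogs bark loudly", the first pick lowers the count of "cat", so the second pick is the sentence about dogs rather than a second sentence about the cat.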
Computer aided text translation apparatus used in Japanese to English
translation - has Japanese and English text search units which search and
perform translation of input search sentence, based on comparison of both
languages text stored in bilingual sentence storing unit

Patent Assignee: FUJI XEROX CO LTD (XERF)

Inventor: MASUICHI H; TATENO M; TATENO S; UMEKI H; UMEMOTO H
Patent Details

Number Kind Lan Pg Dwg Filing Notes
JP 3114703 B2 JA 15 Previously issued patent JP 2000020524

Alerting Abstract JP A

NOVELTY - A search sentence input unit (2) receives text to be translated
from Japanese to English. Based on a comparison of the two languages,
Japanese and English text search units (3,4) search for and perform
the translation of the search sentence.

DETAILED DESCRIPTION - A bilingual sentence storing unit (1) stores
sentences written in Japanese together with their corresponding English
translations.

USE - Used in Japanese to English translation.

ADVANTAGE - A suitable bilingual sentence search is performed regardless
of variation in Japanese expression. No dictionary needs to be produced
beforehand: because the correspondence between the two languages is
acquired dynamically from extensive bilingual sentence pair information,
the bilingual sentence search is performed depending on the input search
query.

DESCRIPTION OF DRAWING(S) - The figure shows the components of a
typical computer aided translation apparatus.

(1) Bilingual sentence storing unit
(2) Search sentence input unit
(3,4) Japanese and English text search units

Title Terms/Index Terms/Additional Words: COMPUTER; AID; TEXT; TRANSLATION;
APPARATUS; JAPAN; ENGLISH; SEARCH; UNIT; PERFORMANCE; INPUT; SENTENCE;
BASED; COMPARE; LANGUAGE; STORAGE; BILINGUAL

Class Codes 

International Classification (+ Attributes)
IPC + Level Value Position Status Version

G06F-0017/27 A I R 20060101
G06F-0017/28 A I R 20060101
G06F-0017/30 A I R 20060101
G06F-0017/27 C I R 20060101
G06F-0017/28 C I R 20060101
G06F-0017/30 C I R 20060101
US Classification, issued: 7047, 7048, 7075, 707536 



File Segment: EPI;
DWPI Class: T01

Manual Codes (EPI/S-X): T01-E01C; T01-J05B3; T01-J14
Original Publication Data by Authority 



Claims: 

...similar to the query from among a set of the first language sentences
stored in the pair data storing means; and second retrieving means for
retrieving second language sentences similar to second language...

...retrieving means from among a set of the second language sentences
stored in the pair data storing means; wherein the second retrieving means
determines and extracts important words from each of...

...importance, wherein, for the set A of the second language sentences 
stored in the pair data storing means, a set B of the second language 
sentences having the same meaning and paired with the respective first 
language sentences retrieved by the first retrieving means and a set C of 
all words appearing in the set B, a first value that is the number 
of sentences included in the set B, a second value that is the 
number of sentences in the set B for each important word candidate 
containing the important word candidate, supposing... 
Document summarizer method for word processor

Patent Assignee: MICROSOFT CORP (MICT) 

NOVELTY - The content words are compared with a precompiled list of words,
which sets range conditions. A summary (72) is created which contains the
higher ranked sentences that satisfy the conditions.

DESCRIPTION - The frequency of the content words in a document (70) is
counted, and sentences which contain more high frequency content words are
ranked higher than sentences which contain fewer high frequency
content words. An INDEPENDENT CLAIM is also included for the computer for
summarizing documents.

USE - For summarizing documents in word processors, electronic mail and
Internet web browsers, e.g. Internet Explorer from Microsoft Corporation.

ADVANTAGE - Creates summaries using a combined statistical and cue-phrase
approach, thus improving the quality of the summary. Enables the author to
place the summary at the top of the document, making it easy for the author
to revise the summary as desired.

DESCRIPTION OF DRAWINGS - The figure shows the documents with summaries 
created. 

70 Document 

72 Summary 
Original Abstracts:

...summarizer performs a statistical analysis to generate a list of ranked
sentences for consideration in the summary. The summarizer counts how 
frequently content words appear in a document and produces a table 
correlating the content words with their corresponding frequency counts. 
Phrase compression techniques... 

...inclusion of the sentence have been satisfied. The summarizer then 
inserts the sentence at the beginning of the document before the 
start of the text. 
INFORMATION PROCESSOR AND PRINTER

PUB. NO.: 2001-282775 [JP 2001282775 A]

PUBLISHED: October 12, 2001 (20011012)

INVENTOR(s): TADA TOMOYUKI

APPLICANT(s): OMRON CORP

APPL. NO.: 2000-090054 [JP 200090054]

FILED: March 29, 2000 (20000329)

INTL CLASS: G06F-017/21; B41J-021/00; G06F-003/12

ABSTRACT

PROBLEM TO BE SOLVED: To calculate the quantity of sentences
corresponding to personal information, and to select and lay out an image
of a size suited to the resulting margin.

SOLUTION: On the basis of data registered in a personal attribute data base
32 and a suited condition data base 33, print information is generated
by a print information synthesizing processing part 31 and output to a
print layout processing part 34. The print layout processing part 34 reads
sentences matched to the print information from a print sentence data
base 34, calculates the quantity of print sentences, calculates the
quantity of margin from the calculated sentence quantity, retrieves the
image ID of the image of optimal size from a print image data base 37, and
selects the corresponding image from an image data group 38 on the basis
of that image ID. The print layout is then determined with reference to a
print layout data base 35 and output to a print data output part 39, and
the data are printed.
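
The layout step in the abstract (sentence quantity, then margin, then the image of optimal size) can be sketched in Python as below. The units are assumptions made only for illustration: sentence quantity is measured in printed lines, image size in lines of height, and "optimal" is read as the largest image that fits the margin. The function names and default page dimensions are hypothetical.

```python
import math

def sentence_lines(sentences, chars_per_line):
    """Sentence quantity: number of printed lines the sentences occupy (assumed unit)."""
    return sum(math.ceil(len(s) / chars_per_line) for s in sentences if s)

def pick_image(sentences, images, page_lines=40, chars_per_line=60):
    """Return the ID of the tallest image that fits the margin, or None.

    images maps a hypothetical image ID to its height in lines.
    """
    margin = page_lines - sentence_lines(sentences, chars_per_line)
    fitting = {image_id: h for image_id, h in images.items() if h <= margin}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

A real layout would also consult the print layout data for positions and fonts; the sketch covers only the quantity-to-margin-to-image selection.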

COPYRIGHT: (C) 2001, JPO

...PUBLISHED: 20011012)

PROBLEM TO BE SOLVED: To calculate the quantity of sentences
corresponding to personal information, and to select and lay out an image
in the size...

...of data registered in a personal attribute data base 32 and a suited
condition data base 33, print information is generated by a print
information synthesizing processing part 31 and outputted to a print...

...print layout processing part 34 reads sentences matched to the print
information from a print sentence data base 34, calculates the quantity
of print sentences, calculates the quantity of margin on the basis of
the calculated result of sentence quantity, retrieves the image ID of
the image in the optimal size from a print image data base 37 and selects a
correspondent image out of an image data group 38 on the basis of that
image ID. While referring to a print layout data...
DOCUMENT STRUCTURE DATA BASE CONSTRUCTION PROCESSING SYSTEM
JAPIO CLASS: 45.4 (INFORMATION PROCESSING -- Computer Applications)
JOURNAL: Section: P, Section No. 1746, Vol. 18, No. 287, Pg. 128, May
31, 1994 (19940531)

ABSTRACT 

PURPOSE: To automatically construct the data base of a document
structure network in a document structure data base construction processing
system, the document structure network describing the paragraph
constitution of a document.

CONSTITUTION: The system is provided with: a paragraph sentence separation
part 11 that segments text data of the input document into paragraph
sentences; a paragraph class feature quantity management part 14 that
manages the feature quantities of the paragraph classes used to classify
the paragraph sentences; a document structure specification part 15 that
specifies the paragraph class to which each paragraph sentence segmented by
the paragraph sentence separation part 11 belongs, by referring to the
management data of the paragraph class feature quantity management part 14,
and specifies the document structure network of the input document by
specifying the connections between the paragraph classes; and a document
structure management part 17 that manages, for each document business, the
document structure network specified by the document structure
specification part 15.

...PUBLISHED: 19940225)

ABSTRACT

PURPOSE: To automatically construct the data base of a document
structure network on a document structure data base construction processing
system for constructing the data base of the document structure network
describing the paragraph constitution of a document...

...part 11 segmenting text data on the input document in the unit of a
paragraph sentence, a paragraph class feature quantity management part
14 managing a feature quantity which a paragraph class classifying the
paragraph sentence...
Semantic annotation providing method for use in data processing system,
involves dividing data set of sentences into set of corpuses, and
learning structure of each sentence of corpus using set of trainers

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC)
Inventor: GAO Y; PICHENY M A; SARIKAYA R
Patent Family (1 patents, 1 countries)
Patent Application
Alerting Abstract US A1

NOVELTY - The method involves dividing a data set of sentences into a
set of corpuses, where each of the corpuses comprises an equal number
of sentences. A structure of each sentence of the corpus is learned using
a set of trainers, and a model is formed based on the structure. A new
sentence is annotated using the model in a set of engines, and the model is
trained using a parse tree that is annotated by a human annotator.
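
The round-based workflow suggested by this abstract can be sketched in Python as follows. The sketch assumes that leftover sentences are dropped so that every corpus has the same size, that the first corpus is annotated entirely by hand, and that each later corpus is pre-annotated by a model trained on everything corrected so far. The callables train, annotate and correct are hypothetical placeholders for the trainers, the engines and the human annotator.

```python
def split_into_corpora(sentences, n_corpora):
    """Divide sentences into n_corpora lists of equal size (remainder dropped)."""
    size = len(sentences) // n_corpora
    return [sentences[i * size:(i + 1) * size] for i in range(n_corpora)]

def annotate_in_rounds(corpora, train, annotate, correct):
    """Grow the labeled set corpus by corpus, retraining before each round.

    train(labeled) -> model
    annotate(model, sentence) -> proposed annotation
    correct(sentence, proposal) -> annotation approved by the human annotator
    """
    # Round 0: no model yet, so the annotator works without a proposal.
    labeled = [correct(s, None) for s in corpora[0]]
    for corpus in corpora[1:]:
        model = train(labeled)  # more training data in every round
        labeled += [correct(s, annotate(model, s)) for s in corpus]
    return labeled
```

Because the labeled set grows each round, later proposals should need fewer corrections, which is the effect the record attributes to the method.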

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following:

1. a framework for fast semi-automatic semantic annotation, comprising an
annotation tool;

2. a data processing system for fast semi-automatic semantic annotation,
comprising a dividing unit;

3. a computer program product in a computer readable medium, comprising a
set of instructions for dividing a data set of sentences into a
set of corpuses.

USE - Used for providing a semantic annotation in a data processing
system (claimed), to train an initial parser.

ADVANTAGE - The method increases the amount of training data provided for
each round of annotation, so that the parser learns more and makes fewer
mistakes in annotation each time. This minimizes the time and cost of the
human annotation required for inspecting and correcting annotated
sentences.

DESCRIPTION OF DRAWINGS - The drawing shows a block representation of a 
semantically annotated sentence. 

522 Pron-sub

540 Verb

550 City

Title Terms/Index Terms/Additional Words: method; data; process; system;
Semantic annotation providing method for use in data processing system,
involves dividing data set of sentences into set of corpuses, and 
learning structure of each sentence of corpus using set of trainers 

Alerting Abstract ...NOVELTY - The method involves dividing a data set
of sentences into a set of corpuses, where each of the corpuses
comprises an equal number of sentences. A structure of each sentence of
the corpus is learned using a set of trainers...

...comprising a dividing unit a computer program product in a computer
readable medium, comprising a set of instructions for dividing a data
set of sentences into a set of corpuses.

...Used for providing a semantic annotation in a data processing system
(claimed), to train an initial parser.
Claims:

...is: 1. A method in a data processing system for fast
semi-automatic semantic annotation, the method comprising: dividing a
data set of sentences into a plurality of corpuses, wherein each of
the plurality of corpuses includes an equal number of sentences;
learning a structure of each sentence of a first corpus using a
plurality of trainers; forming a model based on the structure; and using the
model...

Basic Derwent week: 200630
Statistical classifier constructing method, involves constructing binary
valued feature vectors for sentences in training data, and calculating
initial word/class probability parameter values based on training data
feature vectors
Patent Family (1 patents, 1 countries)
Patent Application
Alerting Abstract US A1

NOVELTY - The method involves receiving labeled training data (308) with
text sentences labeled by a class, and calculating initial class
probability parameter values for each class based on the training data
sentences. A set of binary valued feature vectors (318) is constructed
for sentences in the training data, and initial word/class probability
parameter values are calculated based on the training data feature vectors.

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following: 

1. a computer readable medium including instructions readable by a 
computer for constructing a statistical classifier 

2. a speech utterance classification system comprising a speech utterance 
classification engine. 

USE - Used for constructing a statistical classifier on a natural
language interface. 

ADVANTAGE - The method effectively improves the model performance of a 
statistical classifier with a faster convergence speed. 

DESCRIPTION OF DRAWINGS - The drawing shows a block diagram of a system 
for constructing a statistical classifier. 

302 Classifier construction module 

308 Training data 

310 Pre-processing module 

318 Binary valued feature vector 

322 Initialization module

Title Terms/Index Terms/Additional Words: statistical; classify;
data; calculate; initial; word; class; probability; parameter; based
...and calculating an initial class probability parameter values for each
class based on the training data sentences. A set of binary valued
feature vectors (318) are constructed for sentences in the training data,
and...

Original Publication Data by Authority 



Claims:

...calculating an initial class probability parameter theta_y values for
each class y based on the number of training data sentences having
the corresponding class label; constructing a set of binary valued feature
vectors for sentences in the training data, each set of feature
vectors corresponding to a class label, each feature vector corresponding
to a sentence, each feature corresponding to a word k; calculating
initial word/class probability parameter theta_ky values based on training
data feature...
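
The initialization described in the claim excerpt can be illustrated with a small Python sketch. It assumes plain relative-frequency estimates with no smoothing: theta_y is the share of training sentences labeled y, and theta_ky is the share of class-y sentences whose binary feature for word k is 1. The record does not give the exact formulas, so both estimates and the function name are assumptions.

```python
from collections import Counter, defaultdict

def initial_parameters(labeled_sentences):
    """labeled_sentences: list of (text, class_label) pairs.

    Returns (theta_y, theta_ky) as dictionaries keyed by class and by (word, class).
    """
    vocab = sorted({w for text, _ in labeled_sentences for w in text.lower().split()})
    class_counts = Counter(label for _, label in labeled_sentences)
    total = len(labeled_sentences)
    theta_y = {y: n / total for y, n in class_counts.items()}

    # One binary valued feature vector per sentence, grouped by class label.
    vectors = defaultdict(list)
    for text, label in labeled_sentences:
        present = set(text.lower().split())
        vectors[label].append([1 if w in present else 0 for w in vocab])

    theta_ky = {}
    for y, vecs in vectors.items():
        for k, word in enumerate(vocab):
            theta_ky[(word, y)] = sum(v[k] for v in vecs) / len(vecs)
    return theta_y, theta_ky
```

These values would serve only as a starting point; the record indicates that the classifier is trained further from them.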
Basic Derwent week: 200630 
Word boundary probability estimating device for use in e.g. spelling
checking, has estimator estimating probability of boundary existing in set
of characters by referring to calculated probability between another set of
characters

Patent Assignee: IBM CORP (IBMC); INT BUSINESS MACHINES CORP (IBMC)

Inventor: MORI S; TAKUMA D

Patent Family (2 patents, 2 countries) 

Patent Application 

Number Kind Date Number Kind Date Update

US 20060015326 A1 20060119 US 2005180153 A 20050713 200612 B
JP 2006031295 A 20060202 JP 2004207864 A 20040714 200612 E

Priority Applications (no., kind, date): JP 2004207864 A 20040714

Patent Details

Number Kind Lan Pg Dwg Filing Notes

US 20060015326 A1 EN 19 12
JP 2006031295 A JA 19

Alerting Abstract US A1

NOVELTY - The device has a calculator for calculating a probability of a 
word boundary existing between a set of characters that constitute a 
character string stored in a corpus by invoking information. An estimator 
estimates the probability of the boundary existing in another set of 
characters that constitute another string stored in another corpus by 
referring to the calculated probability between the latter set of
characters.

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following:

1. an unknown word model building method; 

2. a word boundary probability estimating method; 

3. stored software. 

USE - For estimating a probability of a word boundary, in kana-kanji 
conversion, spelling checking, optical character recognition and speech 
recognition technique. 

ADVANTAGE - The configuration of the device improves the accuracy of 
recognition in natural language processing. 

DESCRIPTION OF DRAWINGS - The drawing shows a kana-kanji converting
device.

22 Language decoding section
30 Base form pool
300 Vocabulary dictionary
302 Character dictionary
320, 322 Corpuses
Original Abstracts:

...a relatively large corpus, are given as a training corpus that is
storage containing vast quantities of sample sentences. Vocabulary
including contextual information is expanded from words occurring in
first corpus of relatively small size to words occurring...
Claims:

...containing the first character string comprising the first plurality of
characters or setting up the preliminary information as to whether the
word boundary exists, and means for estimating the probability that the
word boundary will exist in a second plurality of characters constituting
Sentences generating method for use by e.g. writer, involves generating
list of words for each word present in input source from attached
repositories in particular language, and combining all generated lists to
generate sentences

Patent Assignee: BEHBEHANI H (BEHB-I)
Inventor: BEHBEHANI H
Patent Family (1 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update 

US 20050120002 A1 20050602 US 2003507518 P 20031002 200542 B
US 2004939353 A 20040914

Priority Applications (no., kind, date): US 2003507518 P 20031002; US
2004939353 A 20040914

Patent Details

Number Kind Lan Pg Dwg Filing Notes

US 20050120002 A1 EN 11 3 Related to Provisional US 2003507518
Alerting Abstract US A1

NOVELTY - The method involves analyzing a source text, and extracting
words from a source. A list of words is generated for each word present in
an input source from attached repositories in a particular language based
on a desired retrieval mechanism such as predefined lists, aliases and
synonyms. The list is displayed, and a set of desired words is selected
from the list. All the generated lists are combined to generate sentences.
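
The combining step described above can be illustrated with a short Python sketch. It assumes that the repository is a simple mapping from a word to its alternatives (for example synonyms) and that "combining" means taking every combination of one entry per word position. The interactive step in which the user selects desired words is omitted, and the function name is hypothetical.

```python
from itertools import product

def generate_sentences(source_text, repository):
    """Build one word list per source word, then combine the lists.

    repository: dict mapping a lower-case word to a list of alternative words.
    """
    words = source_text.split()
    # Each list starts with the original word, followed by its alternatives.
    lists = [[w] + repository.get(w.lower(), []) for w in words]
    return [" ".join(choice) for choice in product(*lists)]
```

The number of generated sentences is the product of the list lengths, so a practical system would need the selection step from the abstract to keep the output manageable.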

DESCRIPTION - An INDEPENDENT CLAIM is also included for a service for 
generating sentences. 

USE - Used for generating sentences by publishing organizations, writers,
authors, lecturers, teachers and institutions to provide textbooks and
courseware according to students' interests and details, by medical
research doctors for combining medicines, diseases and symptoms, and by
USPTO employees and USPTO customers, and for artistic research, creative
activity and brainstorming.

ADVANTAGE - The method facilitates searching of custom repositories such 
as documents and databases, in an easy manner. The method helps users who 
want to search some repositories which are not in their respective 
languages. The method is useful not only for corporate entities but also 
for individuals. 

DESCRIPTION OF DRAWINGS - The drawing shows an inner working of a process 
of generating sentences. 
Original Abstracts:

...process of text generation/creation is automated. The text to be
processed is used as seed for the text generation process. The text
to be processed can be in any language and can be passed to text generation

Claims:

...Text means data in specific format. It may be single or multiple
sentences, characters, words, numbers, formulae and expressions, etc.
Sentences and text are used alternatively. Alias means a unique name for
accessing multiple lists. An alias can be created by combining multiple
predefined lists thereby and all the entries are all the lists are
accessed attached with an alias. Output device...
Basic Derwent week: 200542



21/5,K/7 (Item 5 from file: 350)

DIALOG(R) File 350: Derwent WPIX

(c) 2008 The Thomson Corporation. All rts. reserv.

0013881346 - Drawing available 
WPI ACC NO: 2004-060250/ 200406 
XRPX Acc No: N2004-048721 

Computerized medical information management system for hospital, uses 
set of universal templates for entering medical data and intelligent data 
fields that adapt to user input automatically 

Patent Assignee: BURSTEIN A (BURS-I); BURSTEIN B (BURS-I)
Inventor: BURSTEIN A; BURSTEIN B
Patent Family (1 patents, 1 countries)
Patent Application

Number Kind Date Number Kind Date Update

US 20030220819 A1 20031127 US 2002151155 A 20020521 200406 B

Priority Applications (no., kind, date): US 2002151155 A 20020521

Patent Details

Number Kind Lan Pg Dwg Filing Notes

US 20030220819 A1 EN 23 14

Alerting Abstract US A1

NOVELTY - The management software uses a minimum set of intelligent 
templates that display medical conditions in graphical user interface (GUI) 
and intelligent data fields that adapt to user input automatically. The 
system includes modules for tracking patients, indicating occupancy of bed, 
and to output billing statements, insurance/management reports, and 
additional administrative documents in grammatically accurate, 
understandable phrases. 

DESCRIPTION - An INDEPENDENT CLAIM is also included for computer program 
product comprising readable medium storing instructions for managing 
medical information. 

USE - In hospital for medical management using intranet and internet.

ADVANTAGE - Eliminates transcription and simplifies the task of recording
symptoms, diagnosis and patient history.



DESCRIPTION OF DRAWINGS - The figure shows a close-up, partial screen 
shot of the on-screen data entry form. 

Title Terms/Index Terms/Additional Words: computer; medical; information;
management; system; hospital; set; universal; template; enter; data; 
intelligence; field; adapt; user; input; automatic 

Class Codes 

International Classification (Main): G06F-017/60
US Classification, issued: 7053

File Segment: EPI;
DWPI Class: S05; T01

Manual Codes (EPI/S-X): S05-G02G1; T01-J06A1; T01-J12B; T01-N02A2; T01-S03

Computerized medical information management system for hospital, uses 
set of universal templates for entering medical data and intelligent data 
fields that adapt to user... 

Original Publication Data by Authority 



Claims: 

...for executing program code under the direction of the processor, a 
storage device for storing data and program code and a bus connecting the 
processor and the storage device; e) a... 

...means for creating, from data entered by the user, reports in natural 
language consisting of grammatical, readily understood sentences and
phrases; 3) intelligent data fields that adapt according to data
previously entered by the... 
Webpage reading method for sight-impaired users, involves reading webpage
from initial reading position according to user- configurable settings 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC) 
Inventor: CRAGUN B J

Patent Family (2 patents, 1 countries) 
Patent Application 

Number Kind Date Number Kind Date Update

US 20030172353 A1 20030911 US 200293159 A 20020307 200377 B
US 7058887 B2 20060606 US 200293159 A 20020307 200638 E

Priority Applications (no., kind, date): US 200293159 A 20020307 

Patent Details 
US 20030172353 A1 EN 21 10

Alerting Abstract US A1

NOVELTY - The method involves determining a set of user-configurable
settings for reading the webpage. An initial reading position on the
webpage is determined based on the user-configurable settings. The webpage
is then read from the initial reading position according to the
user-configurable settings.

DESCRIPTION - INDEPENDENT CLAIMS are also included for the following: 

1. computer readable medium containing program for reading webpage; and 

2. computer program product for reading webpage. 

USE - For programmatically reading webpage using personal digital 
assistant (PDA), wireless device for sight-impaired users. 



ADVANTAGE - By selecting initial display position in a document, optimum 
use of devices with limited display area and communication bandwidth is 
enabled. 

DESCRIPTION OF DRAWINGS - The figure shows the flowchart illustrating the
operation of webpage reader program.

Title Terms/Index Terms/Additional Words: read; method; sight; impair; user;
initial; POSITION; ACCORD; CONFIGURATION; set

Class Codes 

International Classification (+ Attributes)
IPC + Level Value Position Status Version

G06F-0015/00 A I F B 20060101
G06F-0015/00 A I R 20060101
G06F-0015/00 C I L B 20060101
G06F-0015/00 C I R 20060101
US Classification, issued: 715517, 715517, 715523, 704270.1, 709201

File Segment: EPI;
DWPI Class: S05; T01

Manual Codes (EPI/S-X): S05-K; T01-J11C; T01-N03A1; T01-S03
Original Publication Data by Authority 



Original Abstracts: 

A method and apparatus for reading a web page according to a set of 
user-configurable settings. In one embodiment, a set of user-configurable
settings configured for reading the web page is determined. An initial 
reading position on the web page is determined as specified by the 
user-configurable settings. The web page is then read... 

...A method and apparatus for reading a web page according to a set of 
user-configurable settings. In one embodiment, a set of 
user-configurable settings configured for reading the web page is 
determined. An initial reading position on the web page is determined as 
specified by the user-configurable settings. The web page is then read from 
the initial reading position according to the set of 
user-configurable settings. 
Claims: 

What is claimed is: 1. A method of reading a web page according to
a set of user-configurable settings, comprising determining the set
of user-configurable settings; determining an initial reading position on
the web page as specified by the set of user-configurable settings;
and reading, by a reading program, the web page from the initial reading
position according to the set of user-configurable settings.



...What is claimed is: 1. A computer-implemented method of reading a web
page according to a predefined set of user-configurable settings,
comprising: upon retrieving the web page, selecting a setting from the
set of user-configurable settings on the basis of an attribute of the
web page, wherein the attribute is at least one of content of the web
page and a URL of the web page; determining an initial reading
position on the web page as specified by the selected setting; and reading,
by a reading program, the web page from the initial reading position
according to the set of user-configurable settings, wherein the
user-configurable settings are at least one of: a URL setting configured to
identify the web page on the basis of the URL; a link page setting
configured to identify the web page as a link page dependent on a
quantification...

...to identify the web page as an overview page dependent on a
quantification of a number of sentences of readable text in the web
page.
Basic Derwent week: 200377
Redundant information removal method for digital document, involves
organizing document text into sentences and paragraphs, and comparing 
organized document with other documents to identify redundancies 

Patent Assignee: BOREK S E (BORE-I); BOURBAKIS N G (BOUR-I); US SEC OF
AIR FORCE (USAF)
Inventor: BOREK S E; BOURBAKIS N G
Patent Family (2 patents, 1 countries)
Patent Application

Number Kind Date Number Kind Date Update

US 20030145279 A1 20030731 US 2002351636 P 20020125 200367 B
US 2002314189 A 20021205

Patent Details

US 20030145279 A1 EN 12 6 Related to Provisional US 2002351636

Alerting Abstract US A1

NOVELTY - The text of original digital document retrieved from the 
database (100) is organized into sentences and paragraphs. The organized 
document is analyzed and compared with other documents to identify the 
redundancies present in the documents, using information redundancy removal 
(IRR) software (140).

DESCRIPTION - An INDEPENDENT CLAIM is also included for apparatus for 
removing redundant information from digital document. 

USE - For removing redundant information such as paragraph of text or 
images from original document such as web pages, for reconstruction of new 
document related to government organization. 

ADVANTAGE - Removes redundant information from retrieved documents using 
simple technique. 

DESCRIPTION OF DRAWINGS - The figure shows the block diagram of the 
redundant information removal process. 
100 database 
120 search engine 
140 IRR software 
180 new document 
Original Abstracts:

Method and apparatus for reconstructing new documents from a group of
old ones by removing the existing redundant information. Redundant
information (images, text paragraphs) from retrieved multimedia documents
is removed. Each document consists of two main parts stored in different
databases. The first part of a document represents text paragraphs, the...

...paragraphs, by keeping pointers useful for a future reconstruction of
the original documents. The remaining text paragraphs and the set of
points are used to compose the first version of a new document. The
invention also examines all the images related with the set of original
documents and removes the same or similar images while keeping
pointers that could assist a future reconstruction of the original...

...Method and apparatus for reconstructing new documents from a group
of old ones by removing the existing redundant information. Redundant
information (images, text paragraphs) from retrieved multimedia
documents is removed. Each document consists of two main parts stored
in different databases. The first part of a document represents text
paragraphs, the second part consists of the images and drawings related
with the...

...paragraphs, by keeping pointers useful for a future reconstruction of
the original documents. The remaining text paragraphs and the set of
points are used to compose the first version of a new document. The
invention also examines all the images related with the set of original
documents and removes the same or similar images while keeping
pointers that could assist a future reconstruction of the original
documents. The invention merges text...
Claims: 

...of a paragraph in characters; character histograms; number of words in
each sentence; word histograms; starting word of each sentence; and ending
word of a paragraph; determining whether similar said statistical...

...THEN deciding paragraphs are similar, removing redundant paragraph,
and proceeding to said step of comparing said sentences and paragraphs
with other documents OTHERWISE, postponing removal of paragraph; analyzing
corresponding image and data...
Basic Derwent week: 200367 
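
The paragraph statistics listed in the claims excerpt above can be sketched in Python as follows. This is only an illustration: the function names, the sentence-splitting regular expression, and the rule that two paragraphs are redundant when all statistics match exactly are assumptions, since the excerpt does not state how similarity is decided.

```python
import re
from collections import Counter

def paragraph_stats(paragraph):
    """Collect the statistics named in the claims excerpt (hypothetical helper)."""
    text = " ".join(paragraph.split())  # normalize whitespace before measuring
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    words_per_sentence = [s.split() for s in sentences]
    all_words = [w.lower().strip(".,;:!?") for ws in words_per_sentence for w in ws]
    return {
        "length": len(text),                                   # paragraph length in characters
        "char_hist": Counter(text.lower()),                    # character histogram
        "words_per_sentence": [len(ws) for ws in words_per_sentence],
        "word_hist": Counter(all_words),                       # word histogram
        "starting_words": [ws[0].lower() for ws in words_per_sentence],
        "ending_word": all_words[-1] if all_words else "",
    }

def is_redundant(a, b):
    """Assumed rule: paragraphs are redundant when every statistic is identical."""
    return paragraph_stats(a) == paragraph_stats(b)
```

A looser comparison (for example, histogram distance under a threshold) would tolerate small edits between paragraphs; exact equality is used here only to keep the sketch short.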



21/5,K/10 (Item 8 from file: 350)
DIALOG (R) File 350: Derwent WPIX

(c) 2008 The Thomson Corporation. All rts. reserv. 

0009670004 - Drawing available 
WPI ACC NO: 1999-623996/ 199954 
XRPX Acc No: N1999-460741 

Text structure analysis apparatus for documentation apparatus - has tree 
structure determining unit to determine tree structure, based on degree of 
importance calculated between sentences 

Patent Assignee: SHARP KK (SHAF) 
Inventor: OKUNISHI T; YAMAJI T; YOSHIMI T
Patent Family (3 patents, 2 countries)
Patent Application

Number Kind Date Number Kind Date Update

Patent Details

Alerting Abstract JP A

NOVELTY - The input text is divided into sentences and stored in memory
(8). Relation degree calculator (4) calculates the relation between the main
concept and the sentences that are stored in memory. Based on their relation,
importance degree calculator (5) calculates the degree of importance of
each sentence, based on which the tree structure of the input text is determined.

DETAILED DESCRIPTION - An output unit (7) displays the obtained tree
structure of the text. Essential word recognition unit (2) recognizes
essential word from each row and stores it in memory. An INDEPENDENT CLAIM
is also included for analysis program recording medium.

USE - For documentation apparatus.

ADVANTAGE - As text is extracted based on degree of importance, an
accurate text structure analysis is performed.

DESCRIPTION OF DRAWING(S) - The figure shows the text structure analysis
apparatus. (2) Word recognition unit; (4,5) Calculators; (7) Output unit;
(8) Memory.
Original Abstracts:

A text input section (1) divides an inputted text into sentences and
attaches a number to each of the sentences, which is stored in a
text data base together with the number. An important word recognizing
section (2) generates a list of important words...
Basic Derwent week: 199954 
Automatic music composing apparatus - composes music based on extracted
music templates which have characteristic information which are in accord 
with input conditions 

Patent Assignee: YAMAHA CORP (NIHG) 
Inventor: AOKI E; SUGIURA T
Patent Family (3 patents, 2 countries)
Patent Application

Number Kind Date Number Kind Date Update

Alerting Abstract JP A

NOVELTY - The search unit extracts the music template containing the
characteristic information which are in accord with the conditions input by
user, and composes music based on the extracted templates.

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for recording
medium which stores software for music composition.

USE - For composing music automatically.

ADVANTAGE - Extract of many templates is performed. Riot of music is
securable. The odd sound caused by making number of phrases and number of
nodulus in accord with input conditions is suppressed.

DESCRIPTION OF DRAWING(S) - The figure shows block diagram of automatic
music composing apparatus.

Title Terms/Index Terms/Additional Words: automatic; music; compose;
APPARATUS; BASED; EXTRACT; TEMPLATE; CHARACTERISTIC; INFORMATION; ACCORD;
INPUT; CONDITION

Class Codes 

International Classification (Main): G10H-001/00
International Classification (+ Attributes)
IPC + Level Value Position Status Version

G10H-0001/00 A I R 20060101
G10H-0001/00 C I R 20060101
US Classification, issued: 84609, 84610, 84634, 84649, 84650

File Segment: EngPI; EPI;
DWPI Class: W04; P86

Manual Codes (EPI/S-X): W04-U

Original Publication Data by Authority 



Original Abstracts: 

...a musical template data base storing a plurality of musical templates 
each including a first set of data constituting a musical melody sample 
defined by a pattern of musical tone pitch progression in a pattern of 
rhythm to be performed for a musical piece and a second set of data 
indicative of musical features of said musical melody sample. The melody 
sample is constructed and... 

...to define musical features for a musical piece to be composed in terms
of the number of sentences, phrases and measures, and similarity
symbols of each sentence. Comparing the structure and the features...
Claims:

An automatic music composing apparatus comprising: a template data base
storing a plurality of musical templates each including a set of data
defining a musical piece, said musical piece being subdivided into a
plurality of musical segments, said set of data including subsets of
data respectively defining musical properties of said musical
segments; input means for inputting composition conditions including
requirements on musical properties...
Basic Derwent Week: 199928 



21/5,K/12 (Item 10 from file: 350)
DIALOG (R) File 350: Derwent WPIX

(c) 2008 The Thomson Corporation. All rts. reserv. 

0009075567 - Drawing available 
WPI ACC NO: 1998-496375/ 199843 
XRPX Acc No: N1999-033815 

Computer based generation method of thematic summary from document image - 
involves selecting set of thematic sentences based on their score which is 
implemented by value related to frequency of occurrence of thematic word 
image in document 
Patent Family (2 patents, 2 countries)
Patent Application

Alerting Abstract US A

The method involves analysing the document image to identify sentence
boundaries and to identify multiple word image equivalence classes.
A predetermined number of word image equivalence classes is selected as
thematic word images, the number being less than the number of thematic
sentences to be extracted.

Based on occurrence of thematic word images in the sentences, each
sentence is scored. A set of thematic sentences is selected based on the
score. The score of the sentence is incremented by a value related to the
frequency of occurrence of the thematic word image in the document.

USE - For generating thematic summaries without performing character
recognition.

ADVANTAGE - Produces readable and semantically correct thematic summary 
from document image. 
Original Abstracts:

...into text blocks, and text lines, using the median x-height of text
blocks the main body of text is identified. Afterward, word image
equivalence classes and sentence boundaries within the blocks of the main
body of text are determined. The word image equivalence classes are
used to identify thematic words. These, in turn are used to score the
sentences within the main body of text, and the highest scoring
sentences are selected for extraction.
Claims: 

...first number of word image equivalence classes, the first number being 
less than a second number of thematic sentences to be extracted; d) 
scoring each sentence of the first multiplicity of sentences based upon 
occurrence of thematic word images in each sentence; and e) selecting the 
second number of thematic sentences from the first multiplicity of 
sentences based upon the score of each sentence. 
Basic Derwent week: 199843 
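
The scoring scheme in this record can be sketched in Python as below. Plain-text tokens stand in for the word image equivalence classes (the patented method works on images without character recognition), and the function name, parameters, and tie-breaking by original order are assumptions made for illustration.

```python
from collections import Counter

def thematic_summary(sentences, num_thematic_words, num_sentences):
    """Select the highest-scoring sentences; tokens approximate word-image classes."""
    tokenized = [s.lower().split() for s in sentences]
    freq = Counter(tok for toks in tokenized for tok in toks)
    # The most frequent tokens play the role of thematic word images.
    thematic = {word for word, _ in freq.most_common(num_thematic_words)}
    # Each thematic occurrence increments the score by that word's document frequency.
    scores = [sum(freq[t] for t in toks if t in thematic) for toks in tokenized]
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:num_sentences])  # restore document order
    return [sentences[i] for i in chosen]
```

Following the claim, `num_thematic_words` would normally be chosen smaller than `num_sentences`. A practical version would probably also exclude very common function words before ranking, which this sketch does not attempt.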
Logic circuit simulator for logic circuit defined by sentence - has
sentence calculating device that carries out calculation on one of number 
of sentences one at time and output result of calculation based on one of 
operator 

Patent Assignee: NEC CORP (NIDE) 
Inventor: TAKASAKI S

Patent Family (1 patents, 1 countries)
Patent Application

Number Kind Date Number Kind Date Update

US 5689683 A 19971118 US 1990486705 A 19900228 199801 B
US 199374725 A 19930610
US 1995432260 A 19950501

Priority Applications (no., kind, date): JP 198948225 A 19890228; JP
1989131079 A 19890524; JP 1989166926 A 19890630; JP 1989318102 A
19891207

Patent Details

Number Kind Lan Pg Dwg Filing Notes

US 5689683 A EN 25 18 Continuation of application US 1990486705
Division of application US 199374725
Division of patent US 5572708

Alerting Abstract US A

The system includes a model memory for memorising a number of operators 
which are for carrying out operations specified by the sentences. A 
variable memory memorises a number of initial values of the variables 
specified by the sentences. A sentence calculating device is connected to 
the model memory and the variable memory to carry out calculation on one of 
the number of sentences one at a time and output a result of the 
calculation based on at least one of the operators and at least two of the 
initial values of the variables for each calculation of the number of 
sentences.

A data memory is connected to the sentence calculating device to memorise
the results of the calculations for the number of sentences. A
substituting device is connected to the sentence calculating device and the 
data memory to substitute the result of calculation for a previous result 
that was previously calculated according to the one of the sentences. 

ADVANTAGE - Capable of dealing with description of functional level. 

Title Terms/Index Terms/Additional Words: LOGIC; CIRCUIT; SIMULATE; DEFINE;
sentence; calculate; device; carry; one; number; time; output; result;
based; operate

Class Codes

International Classification (+ Attributes)
IPC + Level Value Position Status Version

G06F-0017/50 A I R 20060101
G06F-0017/50 C I R 20060101
US Classification, Issued: 395500, 364489, 364578

File Segment: EPI;
DWPI Class: T01

Manual Codes (EPI/S-X): T01-G06; T01-J15A

...has sentence calculating device that carries out calculation on one of 
number of sentences one at time and output result of calculation based 
on one of operator 

Alerting Abstract ...the model memory and the variable memory to carry 
out calculation on one of the number of sentences one at a time and 
output a result of the calculation based on at least... 

...at least two of the initial values of the variables for each calculation
of the number of sentences.

...connected to the sentence calculating device to memorise the results of
the calculations for the number of sentences. A substituting device is
connected to the sentence calculating device and the data memory to

Original Publication Data by Authority 



Original Abstracts: 

...are related to the current sentence. A data or result memory memorizes 
previous data or initial result values calculated before calculation of 
the current sentence. The result of calculation is substituted for those of 
the previous data or the initial result values which are related to the 
current sentences. Preferably, a flag memory is used to... 
Basic Derwent week: 199801 
Anticipated meaning natural language interface system for computer
application - involves anticipating general meaning of each of number of 
likely user input sentences and storing in computer number of general 
meaning nodes, one for each anticipated user input general meaning 

Patent Assignee: CONRAD D (CONR-I); COSBY C (COSB-I)
Inventor: CONRAD D; COSBY C
Patent Family (1 patents, 1 countries)
Patent Application

Number Kind Date Number Kind Date Update

US 5682539 A 19971028 US 1994315240 A 19940929 199749 B 

Priority Applications (no., kind, date): US 1994315240 A 19940929 

Patent Details 

Number Kind Lan Pg Dwg Filing Notes 

US 5682539 A EN 32 28 

Alerting Abstract US A

The method involves anticipating the general meaning of each of a number 
of likely user input sentences and storing in the computer a number of 
general meaning nodes, one for each anticipated user input general meaning. 
Each node is associated with a function, at least one typical anticipated 
user-input sentence which conveys the general meaning of the node is 
entered. A pattern is generated from the words of the typical sentence, and 
the typical sentence pattern is stored in the computer. A user input
sentence is received and a pattern is generated from the words of the input 
sentence. An algorithm stored in the computer is applied to select which 
one of the number of general meaning nodes is intended by the user by 
comparing the input sentence pattern to the typical sentence patterns. The 
function associated with the selected general meaning node is executed. 

ADVANTAGE - Allows knowledge engineer to build system that recognises any 
language or combination of language received from any source, e.g. keyboard
or voice recognition.

Title Terms/Index Terms/Additional Words: ANTICIPATE; MEANING; NATURAL;
language; interface; system; computer; apply; general; number; user; 
input; sentence; storage; node; one 

Class Codes 

International Classification (+ Attributes)
IPC + Level Value Position Status Version

G06F-0017/28 A I R 20060101 

G06F-0017/28 C I R 20060101 
US Classification, issued: 395759 

Alerting Abstract ...The method involves anticipating the general meaning 
of each of a number of likely user input sentences and storing in the 
computer a number of general meaning nodes, one for each anticipated... 

Original Publication Data by Authority 



Original Abstracts: 

...it is abstracted by the system and compared to abstracted typical 
sentences in the knowledge base. This information, and other
available information, is used by an algorithm to determine which of the
general meaning nodes is intended... 
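
The node-selection step described in this record can be sketched in Python as follows. The record does not say how patterns are built or compared, so the word-set pattern, the Jaccard overlap measure, and the names `pattern` and `select_node` are all assumptions chosen for illustration.

```python
def pattern(sentence):
    """Assumed pattern: the set of lower-cased words in the sentence."""
    return frozenset(sentence.lower().split())

def select_node(user_sentence, nodes):
    """Return the name of the general meaning node whose typical sentence
    pattern best overlaps the user input, or None when nothing overlaps.

    nodes maps a node name to its list of typical anticipated sentences.
    """
    user = pattern(user_sentence)
    best_name, best_score = None, 0.0
    for name, typical_sentences in nodes.items():
        for typical in typical_sentences:
            candidate = pattern(typical)
            union = user | candidate
            # Jaccard overlap is a stand-in for the unspecified algorithm.
            score = len(user & candidate) / len(union) if union else 0.0
            if score > best_score:
                best_name, best_score = name, score
    return best_name
```

In a fuller version each node name would map to a function, and the function for the selected node would be called once `select_node` returns.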
TRANSLATION JUDGMENT DEVICE, METHOD, AND PROGRAM
TRANSLATIONSBEURTEILUNGSEINRICHTUNG, VERFAHREN UND PROGRAMM 
DISPOSITIF, PROCEDE ET PROGRAMME D' EVALUATION DE TRADUCTION 

PATENT ASSIGNEE: 

Laboratory for Language Technology, (7037850), Incorporated 14-6-101
Hama-cho, Ashiya-shi Hyogo 659-0025, (JP), (Applicant designated States:
all)
INVENTOR:

JACOBSON, Yoko Lab. for Language Technology Inc., 4-6-101, Hama-cho,
Ashiya-shi, Hyogo 6590025, (JP)
LEGAL REPRESENTATIVE: 

Fuhlendorf, Horn (3931), Patentanwalte Dreiss, Fuhlendorf, Steimle & 
Becker, Postfach 10 37 62, 70032 Stuttgart, (DE) 
PATENT (CC, No, Kind, Date): EP 1703419 A1 060920 (Basic)
WO 2005059771 050630
APPLICATION (CC, No, Date): EP 2004792480 041015; WO 2004JP15263 041015
PRIORITY (CC, No, Date): JP 2003416778 031215
DESIGNATED STATES: DE; ES; FR; GB; IT; NL
EXTENDED DESIGNATED STATES: AL; HR; LT; LV; MK
INTERNATIONAL PATENT CLASS (V7): G06F-017/28
INTERNATIONAL CLASSIFICATION (V8 + ATTRIBUTES): 
IPC + Level value Position Status version Action Source Office: 

G06F-0017/28 A I F B 20060101 20050707 H EP 
ABSTRACT WORD COUNT: 242 
NOTE: 

Figure number on first page: 2 

LANGUAGE (Publication, Procedural, Application): English; English; Japanese
FULLTEXT AVAILABILITY:

Available Text Language Update Word Count

CLAIMS A (English) 200638 1767 

SPEC A (English) 200638 22204 
Total word count - document A 23971 
Total word count - document B 0 
Total word count - documents A + B 23971 

...SPECIFICATION in Step 108 are not counted twice even if the word
reappears in a natural sentence, in order to avoid repetitively
counting the coinciding words that appeared twice or more.

Thus even if the same coinciding word exists in multiple places of...

12/3,K/2 (Item 2 from file: 348)

DIALOG (R) File 348: EUROPEAN PATENTS 

(c) 2008 European Patent Office. All rts. reserv. 

01784625 

Document and pattern clustering method and apparatus 
Dokument- und Mustergruppierungsverfahren und -Anordnung 
Procede et dispositif de regroupement des documents et des formes 

PATENT ASSIGNEE: 

Hewlett-Packard Development Company, L.P., (4337790), 20555 S.H. 249, 
Houston, TX 77070, (US), (Applicant designated States: all) 
INVENTOR: 

Kawatani, Takahiko, 1950-21-3-515, Mutsuura-cho, Kanazawa-Ku, Yokahama
Kanagawa 236-0032, (JP)
LEGAL REPRESENTATIVE:

Powell, Stephen David et al (52311), WILLIAMS POWELL Morley House 26-30
Holborn Viaduct, London EC1A 2BP, (GB)
PATENT (CC, No, Kind, Date): EP 1455285 A2 040908 (Basic)
EP 1455285 A2 040908
EP 1455285 A3 061220
APPLICATION (CC, No, Date): EP 2004251279 040305;
PRIORITY (CC, No, Date): JP 2003105867 030305; JP 200430629 040206
DESIGNATED STATES: AT; BE; BG; CH; CY; CZ; DE; DK; EE; ES; FI; FR; GB; GR;
HU; IE; IT; LI; LU; MC; NL; PL; PT; RO; SE; SI; SK; TR

EXTENDED DESIGNATED STATES: AL ; LT; LV; MK 

INTERNATIONAL PATENT CLASS (V7) : G06F-017/30 

INTERNATIONAL CLASSIFICATION (V8 + ATTRIBUTES): 

IPC + Level value Position Status version Action Source Office: 

G06F-0017/30 A I F B 20060101 20040705 H EP 

G06K-0009/62 A I L B 20060101 20061110 H EP 
ABSTRACT WORD COUNT: 112 
NOTE: 

Figure number on first page: 1 

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text Language Update Word Count

CLAIMS A (English) 200437 1874 

SPEC A (English) 200437 6522 

Total word count - document A 8398 

Total word count - document B 0 

Total word count - documents A + B 8398 

Document and pattern clustering method and apparatus 

...ABSTRACT A3

In document (or pattern) clustering, the correct number of
clusters and accurate assignment of each document (or pattern) to the
correct cluster are attained. Documents (or patterns) describing the
same topic (or object) are grouped, so a document (or pattern) group
belonging to the same cluster has some commonality. Each topic (or
object) has distinctive terms (or object features) or term (or object
feature) pairs. When the closeness of each document (or pattern) to a
given cluster is obtained, common information about the given
cluster is extracted and used while the influence of terms (or object
features) or term (or...

...SPECIFICATION M denote the number of kinds of the occurring terms, $D_r$
denote the r-th document in a document set D consisting of R
documents, $Y_r$ denote the number of sentences in document $D_r$, and
$d_{ry} = (d_{ry1}, \ldots, d_{ryM})^T$ denote...

...equation (1), the mn components of $S^r$ are given by

Therefore, $S^r_{mm}$ represents the number of sentences in which term
m occurs and $S^r_{mn}$ represents the co-occurrence counts of
sentences in which terms m and n co-occur, if each term does not occur
twice or more in...

...$U^0$) that stores the document frequencies of each term and each term in
the input document set is obtained. Matrices $U^0_{mm}$ and $U^0_{mn}$
respectively denote the number of documents in...
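
The per-document sentence counts and co-occurrence counts described above can be sketched in Python as follows. The function name and the list-of-token-lists input are assumptions; the sketch treats each term as present or absent in a sentence, matching the stated condition that no term occurs twice in a sentence.

```python
def cooccurrence_matrix(sentences, terms):
    """Build an M x M matrix for one document.

    Diagonal entry [m][m] counts the sentences containing term m;
    off-diagonal entry [m][n] counts the sentences containing both m and n.
    sentences is a list of token lists; terms is the list of M term strings.
    """
    index = {term: i for i, term in enumerate(terms)}
    size = len(terms)
    matrix = [[0] * size for _ in range(size)]
    for sentence in sentences:
        # Binary presence vector for this sentence, stored as a set of indices.
        present = {index[word] for word in sentence if word in index}
        for m in present:
            for n in present:
                matrix[m][n] += 1
    return matrix
```

Summing such matrices over all documents in a set would give set-level co-occurrence counts, which is one plausible reading of how the document-frequency matrix in the excerpt is assembled.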
(c) 2008 European Patent Office. All rts. reserv.

01274197 

INFORMATION PROCESSING DEVICE AND INFORMATION PROCESSING METHOD, AND 
RECORDING MEDIUM 

INFORMATIONSVERARBEITUNGSVORRICHTUNG UND INFORMATIONSVERARBEITUNGSVERFAHREN
UND AUFNAHMEMEDIUM
PROCEDE ET DISPOSITIF INFORMATIQUE ET SUPPORT D'ENREGISTREMENT

PATENT ASSIGNEE: 

Sony Corporation, (214028), 7-35, Kitashinagawa 6-chome, Shinagawa-ku,
Tokyo 141-0001, (JP), (Applicant designated States: all)
INVENTOR:

IWAHASHI, Naoto Sony Computer Science Lab. Inc., 3-14-13, Higashi-Gotanda
Shinagawa-ku, Tokyo 141-0022, (JP)
LEGAL REPRESENTATIVE:

Robinson, Nigel Alexander Julian (69551), D. Young & Co., 21 New Fetter
Lane, London EC4A 1DA, (GB)
PATENT (CC, No, Kind, Date): EP 1146439 A1 011017 (Basic)
WO 200116794 010308
APPLICATION (CC, No, Date): EP 2000956860 000831; WO 2000JP5938 000831
PRIORITY (CC, No, Date): JP 99245461 990831
DESIGNATED STATES: DE; FR; GB; NL
EXTENDED DESIGNATED STATES: AL; LT; LV; MK; RO; SI
INTERNATIONAL PATENT CLASS (V7): G06F-017/28
ABSTRACT WORD COUNT: 104 
NOTE: 

Figure number on first page: 1 

LANGUAGE (Publication, Procedural, Application): English; English; Japanese
FULLTEXT AVAILABILITY:

Available Text Language Update Word Count

CLAIMS A (English) 200142 995 

SPEC A (English) 200142 9049 
Total word count - document A 10044 
Total word count - document B 0 
Total word count - documents A + B 10044 

...SPECIFICATION analogousness. In addition, in the method using
co-occurrence information, with respect to a large number of sentences,
co-occurrence information of words appearing in those sentences
is registered. Thus, the word analogousness is determined on the basis
of...
METHOD AND SYSTEM FOR RETRIEVING RELEVANT DOCUMENTS FROM A DATABASE
METHODE UND VERFAHREN UM RELEVANTE DOKUMENTE IN EINER DATENBANK ZU FINDEN
PROCEDE ET SYSTEME POUR L'EXTRACTION DE DOCUMENTS PERTINENTS D'UNE BASE DE
DONNEES 

PATENT ASSIGNEE: 

KCSL, Inc., (2910941), Suite 1012, 5160 Yonge Street, Toronto, Ontario
M2N 6L9, (CA), (Proprietor designated states: all)
INVENTOR:

KAUFMAN, Ilia, 18 Brandy Court, Toronto, Ontario M3B 3L3, (CA)

LEGAL REPRESENTATIVE: 

Boyce, Conor et al (74271), F. R. Kelly & Co., 27 Clyde Road, Ballsbridge 
, Dublin 4, (IE) 
PATENT (CC, No, Kind, Date): EP 1086432 A1 010328 (Basic)
EP 1086432 B1 040407
WO 1999064964 991216
APPLICATION (CC, No, Date): EP 99924619 990607; WO 99CA531 990607
PRIORITY (CC, No, Date): US 88483 P 980608
DESIGNATED STATES: AT; BE; CH; CY; DE; DK; ES; FI; FR; GB; GR; IE; IT; LI;
LU; MC; NL; PT; SE

INTERNATIONAL PATENT CLASS (V7) : G06F-017/30 
NOTE: 

No A-document published by EPO 
LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text Language Update Word Count

CLAIMS B (English) 200415 779 

CLAIMS B (German) 200415 731 

CLAIMS B (French) 200415 857 

SPEC B (English) 200415 6447 
Total word count - document A 0 
Total word count - document B 8814 
Total word count - documents A + B 8814 

...SPECIFICATION sentence quantizer 60 then calculates the sum where the
sum is over only those query-words that are present in the particular
sentence $S_i$.

From these quantities, the sentence quantizer 60 calculates a
position-independent sentence similarity using the following formula:
where #w(Si . . .
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Method for determining the semantic relatedness of lexical items in a text, 
Verfahren zur Bestimmung der semantischen Verwandtschaft zwischen lexikalen
Einzelheiten in einem Text.
Methode pour determiner la parente semantique entre des unites lexicales
dans un texte.

PATENT ASSIGNEE: 

BSO/BURO VOOR SYSTEEMONTWIKKELING B.V., (1192620), Kon. Wilhelminalaan 3,
P.O. Box 8348, NL-3503 RH Utrecht, (NL), (applicant designated states:
AT; BE; CH; DE; DK; ES; FR; GB; GR; IT; LI; LU; NL; SE)
INVENTOR:

Sadler, Victor, Livingstonelaan 304, NL-3526 HW Utrecht, (NL)
LEGAL REPRESENTATIVE:

de Bruijn, Leendert C. et al (19641), Nederlandsch Octrooibureau
Scheveningseweg 82 P.O. Box 29720, NL-2502 LS 's-Gravenhage, (NL)
PATENT (CC, No, Kind, Date): EP 386825 A1 900912 (Basic)
APPLICATION (CC, No, Date): EP 90200462 900226; 
PRIORITY (CC, No, Date): NL 89587 890310 

DESIGNATED STATES: AT; BE; CH; DE; DK; ES; FR; GB; GR; IT; LI; LU; NL; SE
INTERNATIONAL PATENT CLASS (V7): G06F-015/38;
ABSTRACT WORD COUNT: 217

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text  Language   Update  Word Count
CLAIMS A        (English)  EPABF1  506
SPEC A          (English)  EPABF1  4217
Total word count - document A 4723
Total word count - document B 0
Total word count - documents A + B 4723

...ABSTRACT semantically related to each other, comprising the following 
steps: 

a) the retrieval from the said text corpus of a set of sentences 
in which one or more of the given two or more lexical items... 

...SPECIFICATION context makes it possible to find meaningful similarities
in the contextual patterns of semantically related words such as, in
the present example, the words DISCARD and REMOVE.

Even with the limited number of sentences used in this example, a 
number of common contextual elements already appear, if the whole... 
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Method and apparatus for producing an abstract of a document 
Verfahren und Vorrichtung zur Herstellung einer Zusammenfassung eines
Dokumentes
Methode et dispositif pour produire un abrege d'un document

PATENT ASSIGNEE: 

KABUSHIKI KAISHA TOSHIBA, (213130), 72, Horikawa-cho, Saiwai-ku,
Kawasaki-shi, Kanagawa-ken 210, (JP), (applicant designated states:
DE;FR;GB)
INVENTOR:

Doi, Miwako, (L-208) 30-1 Hisamoto Takatsu-ku, Kawasaki-shi Kanagawa-ken,
(JP)

LEGAL REPRESENTATIVE:

Lehn, Werner, Dipl.-Ing. et al (7471), Hoffmann Eitle, Patent- und
Rechtsanwalte, Postfach 81 04 20, 81904 Munchen, (DE)
PATENT (CC, No, Kind, Date): EP 361464 A2 900404 (Basic)
EP 361464 A3 920902
EP 361464 B1 980812
APPLICATION (CC, No, Date): EP 89117915 890928;
PRIORITY (CC, No, Date): JP 88245967 880930
DESIGNATED STATES: DE; FR; GB

INTERNATIONAL PATENT CLASS (V7): G06F-017/24; G06F-017/30;
ABSTRACT WORD COUNT: 112

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text  Language   Update  Word Count
CLAIMS B        (English)  9833    751
CLAIMS B        (German)   9833    714
CLAIMS B        (French)   9833    816
SPEC B          (English)  9833    3502
Total word count - document A 0
Total word count - document B 5783
Total word count - documents A + B 5783

...SPECIFICATION by this method. Moreover, the method has a drawback that, 
as the sentences with frequently appearing words are to be extracted, 
the number of sentences to be extracted also tends to become 
numerous, while a concise abstract is more desirable... 
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MEDIA CONTENT ASSESSMENT AND CONTROL SYSTEMS 

EVALUATION DE CONTENU MEDIA ET SYSTEMES DE GESTION 

Patent Applicant/Assignee: 

WAGGENER EDSTROM WORLDWIDE INC, Three Centerpointe Drive Suite 300, Lake 
Oswego, Oregon 97035, US, US (Residence), US (Nationality), (For all 
designated states except: US) 
Patent Applicant/Inventor:

GALLAGHER Daniel, 152 SE Spokane Street #6, Portland, Oregon 97202, US,
US (Residence), US (Nationality), (Designated only for: US)
LIN Dia, 1545 NE 96th Street, Seattle, Washington 98115, US, US
(Residence), CN (Nationality), (Designated only for: US)
STOFFREGEN Marc, 17217 SW Sandhill Lane, Sherwood, Oregon 97140, US, US
(Residence), US (Nationality), (Designated only for: US)
Legal Representative:

BROOKS Michael Blaine (agent), 1445 East Los Angeles Ave Suite 301z, Simi
Valley, California 93065, US
Patent and Priority Information (Country, Number, Date):
Patent: WO 200828070 A2 20080306 (WO 0828070) 

Application: WO 2007US77286 20070830 (PCT/WO US2007077286) 

Priority Application: US 2006824111 20060831; US 2007846866 20070829 
Designated States: 

(All protection types applied unless otherwise stated - for applications 
2004+) 

AE AG AL AM AT AU AZ BA BB BG BH BR BW BY BZ CA CH CN CO CR CU CZ DE DK
DM DO DZ EC EE EG ES FI GB GD GE GH GM GT HN HR HU ID IL IN IS JP KE KG
KM KN KP KR KZ LA LC LK LR LS LT LU LY MA MD ME MG MK MN MW MX MY MZ NA
NG NI NO NZ OM PG PH PL PT RO RS RU SC SD SE SG SK SL SM SV SY TJ TM TN
TR TT TZ UA UG US UZ VC VN ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC MT
NL PL PT RO SE SI SK TR

(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
(AP) BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: English
Fulltext Word Count: 10824

Fulltext Availability:
Detailed Description 

Detailed Description 

... word or phrase that is repeated. This example counts word frequency
(WF) 411 for word occurrence, and for co-occurrence, counts
sentence frequency (SF) 412 and paragraph frequency (PF) 413. Proximity
counts, such as within three words, or phrase counts and
co-occurrences of phrase counts, e.g., sentence, paragraph, within
specified word proximity, may also be included. The exemplary method may
then rank...
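
The three counts named in this excerpt can be illustrated with a minimal
Python sketch. It is not taken from the application: the function names, the
naive sentence splitter, and the reading of SF and PF as "number of sentences
(paragraphs) that contain the word" are assumptions made for illustration.

```python
import re
from collections import Counter


def split_sentences(paragraph: str) -> list[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]


def tokenize(text: str) -> list[str]:
    # Lower-case word tokens; punctuation is dropped.
    return re.findall(r"[a-z0-9']+", text.lower())


def frequency_counts(document: str) -> dict[str, Counter]:
    """Return word (WF), sentence (SF) and paragraph (PF) frequencies."""
    wf, sf, pf = Counter(), Counter(), Counter()
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", document) if p.strip()]
    for paragraph in paragraphs:
        pf.update(set(tokenize(paragraph)))   # one count per paragraph
        for sentence in split_sentences(paragraph):
            words = tokenize(sentence)
            wf.update(words)                  # every occurrence counts
            sf.update(set(words))             # one count per sentence
    return {"WF": wf, "SF": sf, "PF": pf}
```

A ranking step such as the one the excerpt goes on to mention could then sort
words by any of the three counters, for example with
`frequency_counts(text)["SF"].most_common(10)`.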
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METHOD FOR SCORING CHANGES TO A WEBPAGE 

PROCEDE POUR MARQUER LES CHANGEMENTS APPORTES A UNE PAGE WEB 

Patent Applicant/Assignee: 

MONITOR110 INC, 58 East 11th Street, 3rd Floor, New York, New York 10003, 
US, US (Residence), US (Nationality), (For all designated states 
except: US) 
Patent Applicant/Inventor:

STEWART Jeffrey A, 29 Great Jones, PHW, New York, New York 10012, US, US
(Residence), US (Nationality), (Designated only for: US)
AHMAD Shera, 98-01 67th Ave #9b, Rego Park, New York 11374, US, US 
(Residence), US (Nationality), (Designated only for: US) 
Legal Representative: 

FERRARA Richard P (agent), Fish & Richardson P.C., P.O. Box 1022, 
Minneapolis, Minnesota 55440-1022, US 
Patent and Priority Information (Country, Number, Date):

Patent: WO 2007140364 A2 20071206 (WO 07140364) 

Application: WO 2007US69880 20070529 (PCT/WO US2007069880) 

Priority Application: US 2006808574 20060526; US 2007892945 20070305 
Designated States: 

(All protection types applied unless otherwise stated - for applications 
2004+) 

AE AG AL AM AT AU AZ BA BB BG BH BR BW BY BZ CA CH CN CO CR CU CZ DE DK
DM DO DZ EC EE EG ES FI GB GD GE GH GM GT HN HR HU ID IL IN IS JP KE KG
KM KN KP KR KZ LA LC LK LR LS LT LU LY MA MD ME MG MK MN MW MX MY MZ NA
NG NI NO NZ OM PG PH PL PT RO RS RU SC SD SE SG SK SL SM SV SY TJ TM TN
TR TT TZ UA UG US UZ VC VN ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC MT
NL PL PT RO SE SI SK TR
(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
(AP) BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: English
Fulltext Word Count: 6224

Fulltext Availability:
Detailed Description 

Detailed Description 

look to the text leading up to the third paragraph to see if any
predetermined keywords appear. The calculator may look to a preset
number of characters, sentences, paragraphs or the like leading to the
changed content to perform keyword analysis 430. For...
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METHOD FOR SEARCHING PATENT DOCUMENT BY APPLYING DEGREE OF SIMILARITY AND 
SYSTEM THEREOF 

PROCEDE DE RECHERCHE D'UN DOCUMENT DE BREVET PAR APPLICATION D'UN DEGRE DE 
SIMILITUDE ET SYSTEME ASSOCIE 

Patent Applicant/Inventor:

KIM Jeong-Jin, 102-603 Cheonggu Apt., Hongje4-dong, Seodaemun-gu, Seoul
120-786, KR, KR (Residence), KR (Nationality), (Designated for all)
Legal Representative:

PARK Young-woo (agent), 5F., Seil Building, #727-13, Yoksam-dong,
Gangnam-gu, Seoul 135-921, KR
Patent and Priority Information (Country, Number, Date):

Patent: WO 200752883 A1 20070510 (WO 0752883)

Application: WO 2006KR3125 20060809 (PCT/WO KR2006003125) 

Priority Application: KR 1020050104402 20051102 
Designated States: 

(All protection types applied unless otherwise stated - for applications
2004+)

AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM
DZ EC EE EG ES FI GB GD GE GH GM HN HR HU ID IL IN IS JP KE KG KM KN KP
KZ LA LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM
PG PH PL PT RO RS RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US
UZ VC VN ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL
PL PT RO SE SI SK TR

(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

(AP) BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW

(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: Korean
Fulltext Word Count: 9176

Fulltext Availability:
Detailed Description 

Detailed Description 

of "appearance frequency" of the patent document may be evaluated to
be.

The "weight of number of sentences" indicates in how many sentences
the keywords are found with respect to the number of sentences of
the searched document. An amount of content of the searched document may
be large...

...keywords appear in each of the documents, it is evaluated that the
document in which keywords appear once in three sentences has a
higher "weight of number of sentences", as well as a higher weight of
"appearance frequency".

As described above, when the additional...

...high when the keyword pair exists in the same sentence, and the higher
the distance (number of sentences) between the keywords in the
keyword pair is found to be, the lower the priority may become. In
addition, the priority value may be...
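
The two quantities described in these excerpts can be sketched in Python as
follows. This is only an illustration under stated assumptions: the excerpt
gives no formulas, so the ratio used for the "weight of number of sentences"
and the 1 / (1 + distance) priority are hypothetical choices, as are all
function names and the naive sentence splitter.

```python
import re


def split_sentences(text: str) -> list[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence: str) -> set[str]:
    # Set of lower-cased word tokens in one sentence.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def sentence_weight(text: str, keywords: list[str]) -> float:
    """Share of sentences that contain at least one keyword."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    wanted = {k.lower() for k in keywords}
    hits = sum(1 for s in sentences if wanted & tokenize(s))
    return hits / len(sentences)


def pair_priority(text: str, first: str, second: str) -> float:
    """Hypothetical priority: 1.0 in the same sentence, lower with distance."""
    sentences = [tokenize(s) for s in split_sentences(text)]
    first_at = [i for i, w in enumerate(sentences) if first.lower() in w]
    second_at = [i for i, w in enumerate(sentences) if second.lower() in w]
    if not first_at or not second_at:
        return 0.0
    # Distance is measured in sentences between the closest occurrences.
    distance = min(abs(i - j) for i in first_at for j in second_at)
    return 1.0 / (1 + distance)
```

With this choice a keyword pair found in the same sentence scores 1.0, a pair
in adjacent sentences scores 0.5, and a pair with either keyword missing
scores 0.0; any other monotonically decreasing function of the distance would
fit the excerpt equally well.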
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GENERATING CHINESE LANGUAGE COUPLETS 
GENERATION DE COUPLETS EN CHINOIS 

Patent Applicant/Assignee: 

MICROSOFT CORPORATION, One Microsoft Way, Redmond, Washington 98052-6399,
US, US (Residence), US (Nationality), (For all designated states
except: US)
Inventor(s):

ZHOU Ming, One Microsoft Way, Redmond, Washington 98052-6399, US,
(Designated for all)
SHUM Heung-Yeung, One Microsoft Way, Redmond, Washington 98052-6399, US,
(Designated for all)
Patent and Priority Information (Country, Number, Date):

Patent: WO 200705884 A2-A3 20070111 (WO 0705884) 

Application: WO 2006US26064 20060703 (PCT/WO US2006026064) 

Priority Application: US 2005173892 20050701 
Designated States: 

(All protection types applied unless otherwise stated - for applications 
2004+) 

AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM
DZ EC EE EG ES FI GB GD GE GH GM HN HR HU ID IL IN IS JP KE KG KM KN KP
KR KZ LA LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ
OM PG PH PL PT RO RS RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG
US UZ VC VN ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL
PL PT RO SE SI SK TR

(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
(AP) BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: English
Fulltext Word Count: 6725

Fulltext Availability:
Detailed Description
Claims

Detailed Description

be calculated using Equation 4 and x=E1/S, where E1 is the number of
words appearing only once corresponding to b, and S is the total
number of words in first scroll sentences of the training corpus
corresponding to b, in the training data.
(2) For first scroll...

Claim

the couplet corpus, wherein the sentence counts comprise number of
sentences having a word x, number of sentences having a word y, and
number of sentences having a co-occurrence of word x and word
y.

5. The computer readable medium of claim 3, and further comprising 
constructing a Hidden... 
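
The sentence counts recited in the claim excerpt above (sentences having word
x, sentences having word y, and sentences in which x and y co-occur) can be
gathered in one pass over a corpus. The Python sketch below is a hedged
illustration, not the claimed method: it assumes the corpus is already
segmented into word lists, and the names `sentence_counts` and `cooccurrence`
are hypothetical.

```python
from collections import Counter
from itertools import combinations


def sentence_counts(sentences: list[list[str]]) -> tuple[Counter, Counter]:
    """Count sentences per word and sentences per unordered word pair."""
    single: Counter = Counter()
    pair: Counter = Counter()
    for words in sentences:
        unique = sorted(set(words))            # each word counts once per sentence
        single.update(unique)
        pair.update(combinations(unique, 2))   # keys are sorted (x, y) tuples
    return single, pair


def cooccurrence(pair: Counter, x: str, y: str) -> int:
    # Look the pair up in the same sorted order used when counting.
    return pair[tuple(sorted((x, y)))]
```

The excerpt does not say how the three counts are combined; an association
score such as mutual information could be computed from them, but that step is
left out here because it is not stated in the text.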
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COMPARING TEXT BASED DOCUMENTS 
COMPARAISON DE DOCUMENTS BASES SUR UN TEXTE 

Patent Applicant/Assignee: 

CURTIN UNIVERSITY OF TECHNOLOGY, Kent Street, Bentley, Western Australia
6102, AU, AU (Residence), AU (Nationality), (For all designated states
except: US)
Patent Applicant/Inventor:
WILLIAMS Robert Francis, 5 Roche Court, Bull Creek, Western Australia
6149, AU, AU (Residence), AU (Nationality),
DREHER Heinz, 195 Homestead Road, Mahogany Creek, Western Australia 6072,
AU, AU (Residence), AU (Nationality),
Legal Representative:

GRIFFITH HACK (agent), Level 19, 109 St. Georges Terrace, Perth, Western
Australia 6000, AU
Patent and Priority Information (Country, Number, Date):

Patent: WO 2006119578 A1 20061116 (WO 06119578)

Application: WO 2006AU630 20060512 (PCT/WO AU2006000630) 

Priority Application: AU 2005902424 20050513; AU 2005903032 20050610 
Designated States: 

(All protection types applied unless otherwise stated - for applications 
2004+) 

AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM
DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KN KP KR
KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG
PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC
VN YU ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL
PL PT RO SE SI SK TR
(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
(AP) BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: English
Fulltext Word Count: 11078

Fulltext Availability: 
Detailed Description 

Detailed Description 

... which words appear in the student essay; NoModelConcepts is the number
of concepts for which words appear in the model essay; NoSentences is
the number of sentences in the student essay; NoWords is the number
of words in the student essay; NonConceptualisedWordsRatio.
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WORD ASSOCIATION METHOD AND APPARATUS 

PROCEDE ET APPAREIL D'ASSOCIATION DE MOTS

Patent Applicant/Inventor:

ABIR Eli, 910 Route 35, Cross River, NY 10518, US, US (Residence), IL
(Nationality)

Legal Representative:

SONGER Michael J (et al) (agent), Arnold & Porter, 555 Twelfth Street,
NW, Washington, DC 20004-1206, US,

Patent and Priority Information (Country, Number, Date):

Patent: WO 2003102812 A1 20031211 (WO 03102812)

Application: WO 2003US2516 20030129 (PCT/WO US0302516) 

Priority Application: US 2002157894 20020531; US 2002281997 20021029 

Designated States: 

(Protection type is "patent" unless otherwise stated - for applications 
prior to 2004) 

AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ
EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR
LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG
SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT SE SI
SK TR

(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
(AP) GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: English
Fulltext Word Count: 27708

Fulltext Availability:
Detailed Description 

English Abstract 

...or near equivalent semantically. One method for associating words and
word strings includes querying a collection of documents with a
user-supplied word or word string (input device 210), determining a
user-defined...

Detailed Description
string).

Any combination of recurring patterns of words and word strings based on
the number of sentences in the database in which the word "nets"
appears 3 words before "go to the game" when "tickets" appears 9 words
after "go to the game...
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ELECTRONIC DOCUMENT INDEXING SYSTEM AND METHOD 

SYSTEME ET PROCEDE D'INDEXAGE DE DOCUMENTS ELECTRONIQUES 

Patent Applicant/Assignee: 

HYPERBOLEX LIMITED, Level 2, 19 Tory Street, Wellington, NZ, NZ
(Residence), NZ (Nationality), (For all designated states except: US)
Patent Applicant/Inventor:

ANDERSON Roy Edward, 73 Donald Street, Karori, Wellington, NZ, NZ
(Residence), NZ (Nationality), (Designated only for: US)
Legal Representative:

ADAMS Matthew D (et al) (agent), A J Park, Huddart Parker Building, 6th
Floor, P.O. Box 949, Wellington 6015, NZ,
Patent and Priority Information (Country, Number, Date):

Patent: WO 200394044 A1 20031113 (WO 0394044)

Application: WO 2003NZ82 20030505 (PCT/WO NZ0300082) 

Priority Application: NZ 518744 20020503 
Designated States: 

(Protection type is "patent" unless otherwise stated - for applications 
prior to 2004) 

AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ
EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR
LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE
SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE
SI SK TR

(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

(AP) GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW

(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: English
Fulltext Word Count: 4674

Fulltext Availability:
Detailed Description 

Detailed Description 

meet simple to complex lexical criteria including Boolean expressions. 
A typical expression could be to "find all sentences having words
with the stem "weight" in combination with any of identify, count,
sentence, document".

The collation of word use objects into a set of output sentences can be 
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CONTENT CONVERSION METHOD AND APPARATUS 
PROCEDE ET APPAREIL DE CONVERSION DE CONTENU 

Patent Applicant/Inventor:

ABIR Eli, 910 Route 35, Cross River, NY 10518, US, US (Residence), US
(Nationality)
Legal Representative:

SONGER Michael J (et al) (agent), Arnold & Porter, 555 Twelfth Street,
N.W., Washington, DC 20004-1206, US,
Patent and Priority Information (Country, Number, Date):

Patent: WO 200358374 A2-A3 20030717 (WO 0358374) 

Application: WO 2002US29488 20020918 (PCT/WO US02029488) 

Priority Application: US 200124473 20011221; US 2002157894 20020531 
Designated States: 

(Protection type is "patent" unless otherwise stated - for applications 
prior to 2004) 

AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ
EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR
LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI
SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

(EP) AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LU MC NL PT SE SK TR
(OA) BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
(AP) GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
Publication Language: English
Filing Language: English
Fulltext Word Count: 19291

Fulltext Availability: 
Detailed Description 

English Abstract 

...determining the association between words in a language (Fig. 3). The 
method includes providing a collection of documents (306), selecting 
a first word or word strings, and a second word or word string... 



French Abstract 

...des associations entre des mots dans une langue. Le procede selon
l'invention consiste a collecter des documents, et a choisir un
premier mot ou suite de mots et un deuxieme mot ou...

Detailed Description
...word string).

Any combination of recurring patterns of words and word strings based on 
the number of sentences in the database in which the word "jets" 
appears 3 words before "go to the game" when "tickets" appears 9 words 
after "go to the game... 
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METHOD AND SYSTEM FOR RETRIEVING RELEVANT DOCUMENTS FROM A DATABASE 
PROCEDE ET SYSTEME POUR L'EXTRACTION DE DOCUMENTS PERTINENTS D'UNE BASE DE
DONNEES

Patent Applicant/Assignee:

KAUFMAN CONSULTING SERVICES LTD,
KAUFMAN Ilia,
Inventor(s):

KAUFMAN Ilia,

Patent and Priority Information (Country, Number, Date):
Patent: WO 9964964 A1 19991216

Application: WO 99CA531 19990607 (PCT/WO CA9900531)

Priority Application: US 9888483 19980608

Designated States:

(Protection type is "patent" unless otherwise stated - for applications
prior to 2004)

AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE
GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK
MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN
YU ZA ZW GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE
CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN
GW ML MR NE SN TD TG

Publication Language: English

Fulltext Word Count: 9941

Fulltext Availability: 
Detailed Description 

Detailed Description 

if t is a derivative query-word
where the sum is over only those query-words that are present in the
particular sentence Si.

From these quantities, the sentence quantizer 60 calculates a
position-independent sentence similarity using the following formula.

(4) Similarity...
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Summarization apparatus and method 
Vorrichtung und Verfahren zur Zusammenfassung
Dispositif et procede pour faire des resumes 

PATENT ASSIGNEE: 

FUJITSU LIMITED, (211463), 1-1, Kamikodanaka 4-chome, Nakahara-ku,
Kawasaki-shi, Kanagawa 211-8588, (JP), (Applicant designated States:
all)
INVENTOR:

Nakao, Yoshio, Fujitsu Ltd., 4-1-1, Kamikodanaka, Nakahara-ku,
Kawasaki-shi, Kanagawa 211-8588, (JP)
LEGAL REPRESENTATIVE:

Stebbing, Timothy Charles et al (59643), Haseltine Lake, Imperial House,
15-19 Kingsway, London WC2B 6UD, (GB)
PATENT (CC, No, Kind, Date): EP 1338983 A2
EP 1338983 A3
APPLICATION (CC, No, Date): EP 2003008037 980116;
PRIORITY (CC, No, Date): JP 976777 970117
DESIGNATED STATES: DE; FR; GB
RELATED PARENT NUMBER(S) - PN (AN):

EP 855660 (EP 98300322)
INTERNATIONAL PATENT CLASS (V7): G06F-017/30
ABSTRACT WORD COUNT: 154 
NOTE: 

Figure number on first page: 2 
LANGUAGE (Publication, Procedural, Application): English; English; English

FULLTEXT AVAILABILITY:

Available Text  Language   Update  Word Count
CLAIMS A        (English)  200335  1478
SPEC A          (English)  200335  21974
Total word count - document A 23452
Total word count - document B 0
Total word count - documents A + B 23452

...SPECIFICATION of the document are output by adding a blank extraction
unit with its appearance position set at the end of the document in
step S73 and removing the unit in step S83. The description of the added

...is interested, etc. are stored in the user's preference 16. It also can
store keywords frequently appearing in such a document, the keywords
and question sentences often used in retrieval by a user, etc.
The user's knowledge 17 stores information...

...the users.

The document access log 18 accumulates the history of user's access to
documents and summaries.

The input document (group) 19 basically stores a document to be
summarized, and normally can be generated as any type of electronic
document. Practically...
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02044267 

A method and system for the orchestration of tasks on consumer electronics 
Verfahren und System zum Steuern von Aufgaben in Unterhaltungselektronik
Procede et systeme pour orchestrer des taches en electronique de loisir 

PATENT ASSIGNEE: 

Samsung Electronics Co., Ltd., (7095030), 416 Maetan-Dong Yeongtong-Gu,
Suwon-si, Gyeonggi-Do, (KR), (Applicant designated States: all)
INVENTOR:

Messer, Alan, 225 Calle Marguerita, Los Gatos, California 95032, (US)
Kunjithapatham, Anugeetha, 243 Buena Vista Ave. Apt. 702,
Sunnyvale, California 94086, (US)
LEGAL REPRESENTATIVE:

Waddington, Richard et al (93232), Appleyard Lees, 15 Clare Road, Halifax
HX1 2HY, (GB)

PATENT (CC, No, Kind, Date): EP 1647884 A2 060419 (Basic) 
APPLICATION (CC, No, Date): EP 2005255590 050913; 
PRIORITY (CC, No, Date): US 948399 040922 

DESIGNATED STATES: AT; BE; BG; CH; CY; CZ; DE; DK; EE; ES; FI; FR; GB; GR;
HU; IE; IS; IT; LI; LT; LU; LV; MC; NL; PL; PT; RO; SE; SI; SK; TR

EXTENDED DESIGNATED STATES: AL; BA; HR; MK; YU
INTERNATIONAL CLASSIFICATION (V8 + ATTRIBUTES):
IPC + Level Value Position Status Version Action Source Office:

G06F-0009/44 A I F B 20060101 20060223 H EP
ABSTRACT WORD COUNT: 149
NOTE:

Figure number on first page: 2

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text  Language   Update  Word Count
CLAIMS A        (English)  200616  1703
SPEC A          (English)  200616  6691
Total word count - document A 8394
Total word count - document B 0
Total word count - documents A + B 8394

...SPECIFICATION selected/requested task suggestions. 

For example, as noted task suggestions can be described as
pseudo-sentences comprising a set of elements/terms that modify one
another.

The present invention allows describing user tasks in an incremental 
and flexible way using pseudo-sentences which... 
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A method and system for presenting user tasks for the control of electronic 
devices 

Methode und Vorrichtung zur Darstellung von Benutzeranwendungsfalle zur
Steuerung von elektronischen Geraten
Methode et dispositif pour presenter des taches utilisateurs pour commander
des appareils electroniques

PATENT ASSIGNEE: 

Samsung Electronics Co., Ltd., (7095030), 416 Maetan-Dong Yeongtong-Gu,
Suwon-si, Gyeonggi-Do, (KR), (Applicant designated States: all)
INVENTOR:

Messer, Alan, 225 Calle Marguerita, Los Gatos, California 95032, (US)
Kunjithapatham, Anugeetha, 342 Buena Vista Avenue Apt., 702,
Sunnyvale, California 94086, (US)
LEGAL REPRESENTATIVE:

Waddington, Richard et al (93232), Appleyard Lees, 15 Clare Road, Halifax
HX1 2HY, (GB)

PATENT (CC, No, Kind, Date): EP 1640839 A1 060329 (Basic)
APPLICATION (CC, No, Date): EP 2005255717 050915; 
PRIORITY (CC, No, Date): US 947774 040922 

DESIGNATED STATES: AT; BE; BG; CH; CY; CZ; DE; DK; EE; ES; FI; FR; GB; GR;
HU; IE; IS; IT; LI; LT; LU; LV; MC; NL; PL; PT; RO; SE; SI; SK; TR

EXTENDED DESIGNATED STATES: AL; BA; HR; MK; YU
INTERNATIONAL CLASSIFICATION (V8 + ATTRIBUTES):
IPC + Level Value Position Status Version Action Source Office:

G05B-0019/418 A I F B 20060101 20051119 H EP
ABSTRACT WORD COUNT: 149
NOTE:

Figure number on first page: 2

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text  Language   Update  Word Count
CLAIMS A        (English)  200613  1075
SPEC A          (English)  200613  6691
Total word count - document A 7766
Total word count - document B 0
Total word count - documents A + B 7766

...SPECIFICATION selected/requested task suggestions. 

For example, as noted task suggestions can be described as
pseudo-sentences comprising a set of elements/terms that modify one
another.

The present invention allows describing user tasks in an incremental 
and flexible way using pseudo-sentences which... 
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02036182 

A method and system for describing consumer electronics using separate task
and device descriptions
Methode und Vorrichtung zur Beschreibung von Haushaltselektronik unter
Verwendung von separaten Aufgaben- und Geratebeschreibungen
Methode et dispositif pour decrire des produits electroniques en utilisant
des descriptions de tache et des fonction separes

PATENT ASSIGNEE: 

Samsung Electronics Co., Ltd., (7095030), 416 Maetan-Dong Yeongtong-Gu,
Suwon-si, Gyeonggi-Do, (KR), (Proprietor designated states: all)
INVENTOR:

Messer, Alan, 225 Calle Marguerita, Los Gatos, California 95032, (US)
Kunjithapatham, Anugeetha, 243 Buena Vista Ave., Apt., 702,
Sunnyvale, California 94086, (US)
LEGAL REPRESENTATIVE:
Waddington, Richard et al (93232), Appleyard Lees, 15 Clare Road, Halifax
HX1 2HY, (GB)

PATENT (CC, No, Kind, Date): EP 1640838 A1 060329 (Basic)
EP 1640838 B1 071017
APPLICATION (CC, No, Date): EP 2005255716 050915; 
PRIORITY (CC, No, Date): US 950121 040924 

DESIGNATED STATES: AT; BE; BG; CH; CY; CZ; DE; DK; EE; ES; FI; FR; GB; GR;
HU; IE; IS; IT; LI; LT; LU; LV; MC; NL; PL; PT; RO; SE; SI; SK; TR

EXTENDED DESIGNATED STATES: AL; BA; HR; MK; YU
INTERNATIONAL CLASSIFICATION (V8 + ATTRIBUTES):
IPC + Level Value Position Status Version Action Source Office:

G05B-0019/418 A I F B 20060101 20051119 H EP
ABSTRACT WORD COUNT: 149
NOTE:

Figure number on first page: 2

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY:

Available Text  Language   Update  Word Count
CLAIMS A        (English)  200613  1073
CLAIMS B        (English)  200742  1125
CLAIMS B        (German)   200742  1058
CLAIMS B        (French)   200742  1380
SPEC A          (English)  200613  7030
SPEC B          (English)  200742  6847
Total word count - document A 8104
Total word count - document B 10410
Total word count - documents A + B 18514
...SPECIFICATION selected/requested task suggestions.

For example, as noted task suggestions can be described as
pseudo-sentences comprising a set of elements/terms that modify one
another.

The present invention allows describing user tasks in an incremental
and flexible way using pseudo-sentences which...

...SPECIFICATION selected/requested task suggestions.

For example, as noted task suggestions can be described as
pseudo-sentences comprising a set of elements/terms that modify one
another.

The present invention allows describing user tasks in an incremental
and flexible way using pseudo-sentences which...
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Method and apparatus for classification of relative position of one or more 

text messages in an email thread
Methode und Apparat zur Klassifikation von relativen Positionen einer oder
mehrerer Textnachrichten in einem Email thread
Procede et dispositif pour classification de la position relative d'un ou
plusieurs messages textes dans un thread des courriers electroniques

PATENT ASSIGNEE: 

Avaya Technology Corp., (3148500), 211 Mount Airy Road, Basking Ridge, NJ
07920, (US), (Applicant designated States: all)
INVENTOR:

Bagga, Amit, 1054 Shadowlawn Drive, Green Brook, NJ 08812, (US)
Nenkova, Ani N., c/o Michele Banko, 302 18th Ave. East, Seattle, WA 98102,
(US)

LEGAL REPRESENTATIVE:

Williams, David John et al (86433), Page White & Farrer Bedford House
John Street, London, WC1N 2BF, (GB)
PATENT (CC, No, Kind, Date): EP 1591925 A2 051102 (Basic)
EP 1591925 A3 070620
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...SPECIFICATION Generally, the present invention recognizes that emails 
that report problems or pose questions (most probably root messages ) 
will be characterized by different punctuation than messages that contain 
answers or solutions. 

v. Length of Email Message 

The length of an email message, for example, in terms of the number
of sentences, can also be used as a feature. The length of an email
message can be...

...root versus non-root word list 240 can be based on an examination of a
set of root and non-root messages. Two dictionaries can be
constructed, with a first dictionary listing words typically occurring in
non-root messages and another dictionary listing words typically
occurring in root messages. The occurrence numbers can optionally be
tested for statistical significance with the binomial test and...

...versus non-root classification task. In an exemplary implementation, the
list of words typical for root messages was very short, while the
list of words typical for non-root messages consisted of...
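The two-dictionary idea in the excerpt above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the whitespace tokenizer, the one-sided binomial tail, the 0.05 threshold, and the function names are all assumptions made for illustration.

```python
from collections import Counter
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def build_word_lists(root_msgs, non_root_msgs, alpha=0.05):
    """Split words into 'typical for root' and 'typical for non-root' lists.

    Assumption: a word is typical for a class when it occurs there
    significantly more often than the class's share of all tokens predicts.
    """
    root_counts = Counter(w for m in root_msgs for w in m.lower().split())
    other_counts = Counter(w for m in non_root_msgs for w in m.lower().split())
    root_total = sum(root_counts.values())
    other_total = sum(other_counts.values())
    p_root = root_total / (root_total + other_total)

    root_words, non_root_words = [], []
    for word in set(root_counts) | set(other_counts):
        k_root, k_other = root_counts[word], other_counts[word]
        n = k_root + k_other
        if binomial_tail(k_root, n, p_root) < alpha:
            root_words.append(word)
        elif binomial_tail(k_other, n, 1 - p_root) < alpha:
            non_root_words.append(word)
    return sorted(root_words), sorted(non_root_words)
```

Words that are spread evenly across both classes fail the test in both directions and land in neither list, which matches the excerpt's observation that the resulting lists can be short.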
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...SPECIFICATION title if such a part is detected. A third method is to 
extract a predetermined number of sentences or words described at the 
beginning of a document and employ the extracted sentence or words as 
a title. The first, second, and third ... displayed, and a categorization
result outputting unit 94 for outputting the categorization result
including the cluster-merging-process information.

The clustering unit 91 includes a document storage unit 911, a 
sentence analyzer 912, a feature element extractor 913, a feature table 

...title if such a part is detected. A third method is to extract a 
predetermined number of sentences or words located at the beginning 
of a document and employ the extracted sentence or words as a title. 
The first, second, and third... 

...of the feature table and categorizes the documents D1, D2, ..., D7 into a
plurality of clusters according to semantic similarity. Documents
including a common feature element are detected on the basis of the 
feature elements included... 
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...SPECIFICATION No. 7-36896 "Method and Apparatus for Generating Digest" 
extracts major expressions (word, etc.) as seed from a document based 
on the complexity of an expression (length of a word, etc.) and generates 
a. . . 

...Application Laid-open No. 8-297677 "Method of Automatically Generating 
Digest of Topics" detects "topical terms" based on the appearance
frequency of words in a document and generates a summary by extracting 
sentences containing many major "topical terms". 
The second method judges the (relative) importance of sentences based 
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...SPECIFICATION Figures 5, 6 and 7. within the inner scanning loop, having 

selected a particular word group element WG(S,k,i), and having 
established that it is not null, remaining word groups... 

...those with a higher value of k within the same sentence and those in 
later sentences only, are checked for matching word groups. For each 
match found, the weighting of word group WG(S,k,i) is incremented and
the matching word group is set to... 

23/3.K/10 (Item 10 from file: 348) 

DIALOG (R) File 348: EUROPEAN PATENTS 

(c) 2008 European Patent Office. All rts. reserv. 

00672789 

Dictionary creation supporting system 
Unterstutzungssystem zur Herstellung von Worterbuchern
Systeme de support pour la creation de dictionnaires

PATENT ASSIGNEE: 

KABUSHIKI KAISHA TOSHIBA, (213130), 72, Horikawa-cho, Saiwai-ku,
Kawasaki-shi, Kanagawa-ken 210-8572, (JP), (Proprietor designated
states: all)
INVENTOR:

Hirakawa, Hideki, 1-18-24, Katida-Minami, Kohoku-ku, Yokohama-shi,
Kanagawa-ken, (JP)
Kumano, Akira, 7-4-401, Nakadai, Higashiterao, Tsurumi-ku, Yokohama-shi,
Kanagawa-ken, (JP)
LEGAL REPRESENTATIVE: 

Lehn, Werner, Dipl.-Ing. et al (7474), Hoffmann Eitle, Patent- und
Rechtsanwalte, Arabellastrasse 4, 81925 Munchen, (DE)
PATENT (CC, No, Kind, Date): EP 645720 A2 950329 (Basic)
EP 645720 A3 951129
EP 645720 B1 010801
APPLICATION (CC, No, Date): EP 94114789 940920;
PRIORITY (CC, No, Date): JP 93232649 930920
DESIGNATED STATES: DE; FR; GB 

INTERNATIONAL PATENT CLASS (V7) : G06F-017/27; G06F-017/28 

ABSTRACT WORD COUNT: 164 

NOTE: 

Figure number on first page: 1 

LANGUAGE (Publication, Procedural, Application): English; English; English
FULLTEXT AVAILABILITY: 

Available Text   Language    Update    Word Count
CLAIMS A         (English)   EPAB95    1569
CLAIMS B         (English)   200131    1485
CLAIMS B         (German)    200131    1555
CLAIMS B         (French)    200131    1829
SPEC A           (English)   EPAB95    9103
SPEC B           (English)   200131    9012
Total word count - document A          10674
Total word count - document B          13881
Total word count - documents A + B     24555

...SPECIFICATION TEDUN" (Japanese word generically meaning operation
procedure) are outputted in this order as these composite words appear
in this order in the original sentences. In this case, the operation
of the registration word selection processing si proceeds as follows...

...and the value of its superficial position (a value of "mds") is the
smallest. This element is set as the element 1.

(2) The element 1 is deleted from the output information source
file.

(3) The... element 1. These elements are set as the element 2,
..., element N.

(4) The element 2, ..., element N are deleted from the
output information source...

...SPECIFICATION TEDUN" (Japanese word generically meaning operation
procedure) are outputted in this order as these composite words appear
in this order in the original sentences. In this case, the operation
of the registration word selection processing si proceeds as follows...

...and the value of its superficial position (a value of "mds") is the
smallest. This element is set as the element 1.

(2) The element 1 is deleted from the output information source file.

(3) The...

...is searched to take out the elements having the same registration
knowledge information as the element 1. These elements are set as
the element 2, ...
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. . .ABSTRACT A2 

A computerized method for organizing information retrieval based on the
content of a set of primary documents. The method generates answer
hypotheses based on text found in the primary documents and,
typically, a natural-language input string such as a question. The answer
hypotheses can...

...A text corpus (12) can be queried to provide verification evidence not
present in the primary documents. In another aspect the method is
implemented in the context of a larger two-phase...

...SPECIFICATION is substituted for a placeholder or placeholders in the
template.

6.5 Matching Templates Against Primary Documents

In step 264 an attempt is made to verify the linguistic relation under
consideration for the hypothesis under consideration in the context of
the primary documents. This is done by matching the filled-in
templates generated in step 263 against the primary documents. In
other words, sentences in which the hypothesis appears in the
context of a template are sought in the primary documents. Any such
sentences found are retained in association with the hypothesis as
verification evidence for...
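As a rough illustration of the matching step described above, the following Python sketch fills a placeholder in each template with the hypothesis and keeps the sentences that contain the filled-in text. The `{HYP}` placeholder syntax, the regex sentence splitter, and all function names are assumptions for illustration only; the excerpt does not specify how templates are represented.

```python
import re


def fill_template(template: str, hypothesis: str) -> str:
    """Substitute the hypothesis for the (hypothetical) {HYP} placeholder."""
    return template.replace("{HYP}", hypothesis)


def split_sentences(text: str) -> list:
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_evidence(hypothesis, templates, primary_documents):
    """Return sentences in which the hypothesis appears in a template context."""
    patterns = [
        re.compile(re.escape(fill_template(t, hypothesis)), re.IGNORECASE)
        for t in templates
    ]
    evidence = []
    for doc in primary_documents:
        for sentence in split_sentences(doc):
            if any(p.search(sentence) for p in patterns):
                evidence.append(sentence)
    return evidence
```

The returned sentences play the role of the verification evidence that is retained with the hypothesis; a hypothesis with no matching sentence simply yields an empty list.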

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...ABSTRACT model are combined into a combined score for each intermediate 
target-structure hypothesis. Finally, a set of target-text hypotheses
is produced by transducing the highest scoring target-structure 
hypotheses into portions of text... 

...SPECIFICATION from several years of the proceedings of the Canadian
parliament. From these translations, a training data set is chosen
comprising those pairs for which both the English sentence and the French

...that abound in the text, an English vocabulary is chosen consisting of
all of those words that appear at least twice in English sentences
in the data, and a French vocabulary is chosen consisting of all
those words that appear at least twice in French sentences in the
data. All other words are replaced with a special unknown English word or
unknown...
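The vocabulary rule quoted above (keep words seen at least twice, map everything else to a special unknown word) can be sketched in a few lines of Python. The `<unk>` token spelling, whitespace tokenization, and function names are assumptions; the excerpt does not give them.

```python
from collections import Counter

UNKNOWN = "<unk>"  # hypothetical spelling of the special unknown word


def build_vocabulary(sentences, min_count=2):
    """Keep every word that appears at least min_count times in the data."""
    counts = Counter(word for s in sentences for word in s.split())
    return {word for word, count in counts.items() if count >= min_count}


def replace_unknown(sentence, vocabulary):
    """Replace out-of-vocabulary words with the unknown token."""
    return " ".join(w if w in vocabulary else UNKNOWN for w in sentence.split())
```

The same two functions would be applied separately to the English and the French side of the training pairs, giving one vocabulary per language.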
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Detailed Description 

English Abstract 

Methods and systems for syntactically indexing and searching data sets
to achieve more accurate search results and for indexing and searching
data sets using entity tags alone or in combination therewith are
provided. Example embodiments provide a Syntactic Query Engine ("SQE")
that parses, indexes, and stores a data set, as well as processes
natural language queries subsequently submitted against the data set.
The SQE comprises a Query Preprocessor, a Data Set Preprocessor, a
Query Builder, a Data Set Indexer, an Enhanced Natural Language
Parser ("ENLP"), a data set repository, and, in some embodiments, a
user interface. After preprocessing the data set, the SQE parses the
data set according to a variety of levels of parsing and determines as
appropriate the entity tags...

...grammatical roles of each term to generate enhanced data representations
for each object in the data set. The SQE indexes and stores these
enhanced data representations in the data set repository. Upon
subsequently receiving a query, the SQE parses the query also using a
variety of parsing levels and searches the indexed stored data set to
locate data that contains similar terms used in similar grammatical
roles and/or with similar entity tag...

Publication Year: 2004 

Detailed Description 

subject/verb/preposition/verb modifier/object; and
noun/noun modifier.

Such support includes locating sentences in which the designated terms
appear in the associated designated syntactic or grammatical role, as
well as locating, when contextually appropriate...
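To make the idea of role-aware lookup concrete, here is a minimal Python sketch of an index keyed by (term, grammatical role). It is not the SQE itself: the class name, the role labels, and the assumption that each sentence arrives already parsed into a term-to-role mapping are all hypothetical.

```python
from collections import defaultdict


class RoleIndex:
    """Toy inverted index keyed by (term, grammatical role)."""

    def __init__(self):
        self._postings = defaultdict(set)  # (term, role) -> sentence ids
        self._sentences = []

    def add(self, sentence, roles):
        """Index a sentence given a pre-computed {term: role} mapping."""
        sentence_id = len(self._sentences)
        self._sentences.append(sentence)
        for term, role in roles.items():
            self._postings[(term.lower(), role)].add(sentence_id)
        return sentence_id

    def search(self, query_roles):
        """Return sentences where every query term appears in its given role."""
        ids = None
        for term, role in query_roles.items():
            hits = self._postings.get((term.lower(), role), set())
            ids = hits if ids is None else ids & hits
        return [self._sentences[i] for i in sorted(ids or ())]
```

A plain keyword index would return both "Argentina exports beef" and "Europe exports cars to Argentina" for the query terms "Argentina" and "exports"; keying on the role lets the lookup keep only the sentence where Argentina is the subject.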



23/3.K/14 (Item 14 from file: 349) 
DIALOG (R) File 349: PCT FULLTEXT
(c) 2008 WIPO/Thomson. All rts. reserv.

01123033 

METHOD AND SYSTEM FOR USING QUERY INFORMATION TO ENHANCE CATEGORIZATION AND
NAVIGATION WITHIN THE WHOLE KNOWLEDGE BASE
PROCEDE ET SYSTEME PERMETTANT D'UTILISER DES INFORMATIONS DE REQUETES POUR
AMELIORER LA CATEGORISATION ET LA NAVIGATION DANS LA TOTALITE DE LA
BASE DE CONNAISSANCES
Patent Applicant/Assignee:

KENNETH Nadav, 30 Ha-Mazbiim Street, 69935 Tel Aviv, IL, IL (Residence),
IL (Nationality), (For all designated states except: US)
MIZRAHI Moshe, 21 Avner Street, 69937 Tel Aviv, IL, IL (Residence), IL
(Nationality), (For all designated states except: US)
Patent Applicant/Inventor:

SEBBANE Danny, 18 Adam Hacohen Street, 64585 Tel Aviv, IL, IL (Residence),
IL (Nationality)
Legal Representative:

NAOMI ASSIA LAW OFFICES (agent), 32 Habarzel Street, Ramat Hachayal,
69710 Tel Aviv, IL,
Patent and Priority Information (Country, Number, Date):

Patent: WO 200444896 A2-A3 20040527 (WO 0444896) 

Application: WO 2003IL938 20031110 (PCT/WO IL03000938) 

Priority Application: US 2002425728 20021113 
Designated States: 

(Protection type is "patent" unless otherwise stated - for applications 
prior to 2004) 
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Detailed Description 

English Abstract 

...is disclosed to create some structure from the knowledge base of an 
organization, the knowledge base including a document database (DB) 
and queries submitted by users concerning the documents, wherein the 
method performs monitoring... 
Publication Year: 2004 

Detailed Description 

to each other, as described in step 4 below. Queries are associated
with phrases (or sentences) and clusters are associated with
documents. Thus, words that appear in queries have an added
component relative to those that only appear in documents. A...

...documents; and phrases. A word that also appears in queries has a
4-dimensional vector: documents; phrases; clusters; and queries. A
vector is used to represent the distribution of the word in the...
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Detailed Description 



English Abstract 

...OCR) engine and apparatus to perform the method. This exemplary method 
includes segmenting the character data into a set of initial words. 
The set of initial words is word level processed to determine at... 
Publication Year: 2004 

Detailed Description 

no final sentence is selected, but the candidate word sets are
examined and any candidate words that do not appear in at least one
of the candidate sentences having the highest POS tri-gram cost are
removed. If only one candidate word remains...

...no final sentence is selected, but the candidate word sets are examined
and any candidate words that do not appear in at least one of the
candidate sentences having the highest word tri-gram cost are removed.
If only one candidate word remains...
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Detailed Description 

in "meaning" is ignored when the goal is to locate or detect a relatively
small set of documents out of a collection comprising millions of
documents. In the present invention the goal is quite different - the
variations in wordings are captured ... aiming at directing the users'
attention to zones where the link set for the constituent sentences
indicate a bundle of focused words or several co-occurring focused
words. More specifically, the idea is: when the user selects a document
for exploration, a text ... otherwise similar sentences can be notified as
different.

The following elements constitute parts of the information in the link
sets and the words listed in order of appearance in the sentences

Sentences marked as 1 and 2 share 4 noun elements, of the 4 noun
elements are...
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METHOD AND SYSTEM FOR ENHANCED DATA SEARCHING 
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Fulltext Availability: 

Detailed Description 

English Abstract 

Methods and systems for syntactically indexing and searching data sets
to achieve more accurate search results are provided. Example
embodiments provide a Syntactic Query Engine ("SQE") that parses,
indexes, and stores a data set, as well as processes natural language
queries subsequently submitted against the data set. The SQE
comprises a Query Preprocessor, a Data Set Preprocessor, a Query
Builder, a Data Set Indexer, an Enhanced Natural Language Parser
("ENLP"), a data set repository, and, in some embodiments, a user
interface. After preprocessing the data set, the SQE parses the data
set and determines the syntactic and grammatical roles of each term to
generate enhanced data representations for each object in the data set.
The SQE indexes and stores these enhanced data representations in the
data set repository. Upon subsequently receiving a query, the SQE
parses the query similarly and searches the indexed stored data set
to locate data that contains similar terms used in similar grammatical
roles. In this manner, the SQE is...
Publication Year: 2003 



Detailed Description 
modifier;
subject/verb/preposition/verb modifier/object; and
noun/noun modifier.

Such support includes locating sentences in which the designated terms
appear in the associated designated syntactic or grammatical role, as
well as locating, when contextually appropriate, sentences in which the
designated terms appear but where the designated roles ... may be
implemented to recognize any number of
programmable attributes in natural language queries and data sets
(described in detail as "preferences" with reference to Figure 15). In
one embodiment, these attributes...
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Fulltext Availability: 

Detailed Description 
Publication Year: 2003 

Detailed Description 

conduct elementary morphological word analysis. Commonly, the summary
was made up from the sentences of initial text that received the
highest rank, or that met some other criteria. The statistics, in such
cases, were collected on text word usage rate. That is, the more the
word was found in the text, the weightier it was considered. Auxiliary
words and other ... of a word in a document set was taken into
consideration. Such estimation is discussed in U.S.
Patent No. 6,128... values, such as the average number of words and
symbols in a sentence, the average number of sentences in the
paragraph, and so on. Then, the topic of the document is defined on.
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METHOD AND APPARATUS FOR TRANSFORMING CONTENTS ON THE WEB 

PROCEDE ET APPAREIL PERMETTANT LA TRANSFORMATION DE CONTENUS EN LIGNE 
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Detailed Description 

French Abstract 

...40) sont transformes de maniere appropriee grace a un systeme de
transformation de contenus (10) base sur les articles d'information
des contenus en ligne et les resultats de l'analyse semantique et
conformement...
Publication Year: 2002 

Detailed Description 

the document, and menu information, the creation
of a summary page, the creation of the lists of keywords, key
sentences etc. and links to places where the keywords etc.
appear, and the creation of the hyperlinks among the created
pages. The web contents are displayed ... summary, keywords and key
sentences, the pages which contain
the lists of the keywords, key sentences etc. and the links to
the places where the keywords, key sentences etc. appear in the
document, respectively, and document fragments which are
obtained by dividing the body of...
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Detailed Description 
Publication Year: 2002 

Detailed Description 

... write Key = HMAC (MK, Server-Subject-Name)

1. S -> C: Server-Finish

Same format as Data message, with the contents being the 160-bit value
SHA1 (Server-Nonce || Client-Nonce...

...Reuse-MK record to avoid round-trip delays.

2. C -> S: Client-Finish

Same format as Data message, with the contents being the 160-bit
value SHA1 (Client-Nonce || Server-Nonce). This is encrypted with the
Client-write key, which is derived from master key.

3. Both sides confirm that the Finish records have the expected contents,
and then...
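The Finish exchange above can be sketched with Python's standard `hashlib` and `hmac` modules. This is only an illustration under stated assumptions: the excerpt says "HMAC" without naming the hash, so HMAC-SHA1 is assumed here; the function names are hypothetical; and the encryption of the Finish record with the write key is left out because the excerpt does not name a cipher.

```python
import hashlib
import hmac


def derive_write_key(master_key: bytes, subject_name: bytes) -> bytes:
    """Derive a write key as HMAC(MK, subject name).

    Assumption: HMAC-SHA1; the excerpt does not name the underlying hash.
    """
    return hmac.new(master_key, subject_name, hashlib.sha1).digest()


def finish_value(first_nonce: bytes, second_nonce: bytes) -> bytes:
    """Return the 160-bit value SHA1(first_nonce || second_nonce)."""
    return hashlib.sha1(first_nonce + second_nonce).digest()


def verify_finish(received: bytes, first_nonce: bytes, second_nonce: bytes) -> bool:
    """Check a received Finish value using a constant-time comparison."""
    return hmac.compare_digest(received, finish_value(first_nonce, second_nonce))
```

Under this reading, the server would send `finish_value(server_nonce, client_nonce)` and the client `finish_value(client_nonce, server_nonce)`; because the concatenation order differs, one side's Finish value cannot be replayed as the other's.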
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COMPUTER NETWORK INFORMATION MANAGEMENT SYSTEM AND METHOD 
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Detailed Description 

English Abstract 

...by gathering summary data from the information provider node 
indicative of event changes at the information provider node by 
information collection agents extracting information from the 
information provider node based on the summary data; transmitting the 
extracted information to the server; storing... 

Publication Year: 2001 

Detailed Description 

ii. Create, ranked by order of occurrence, the most frequent word list
(MFWL) from the words in the rwl

iii. Find sentences in the document containing the top 3 words in the
MFWL

iv. Store these sentences...
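Steps ii to iv above can be sketched as follows. Step i and the exact meaning of the word list the MFWL is drawn from are not shown in the excerpt, so this sketch assumes that list is simply the document's words minus a small stop-word set, and that a sentence qualifies if it contains any of the top three words. The stop-word set and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list standing in for whatever step i removes.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "on", "it"}


def tokenize(text):
    """Lower-case the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def most_frequent_words(text, top_n=3):
    """Step ii: rank the remaining words by number of occurrences."""
    words = [w for w in tokenize(text) if w not in STOP_WORDS]
    return [word for word, _ in Counter(words).most_common(top_n)]


def key_sentences(text, top_n=3):
    """Steps iii and iv: collect sentences containing any top-ranked word."""
    top = set(most_frequent_words(text, top_n))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if top & set(tokenize(s))]
```

Requiring all three words in one sentence instead of any one of them would be a one-line change (`top <= set(tokenize(s))`); the excerpt does not say which reading is intended.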
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(EP) AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE 



(OA) BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG
(AP) GH GM KE LS MW MZ SD SL SZ TZ UG ZW
(EA) AM AZ BY KG KZ MD RU TJ TM
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Fulltext Availability: 

Detailed Description 

English Abstract 

...to users by having a publisher, or a multilevel structure of primary 
and secondary publishers, collect information items into at least 
one database for periodic delivery of collections of information 
items to users as personalized information . The collections are 
selected based on user profiles that are refined based on collecting and 
analyzing subjective... 

French Abstract 

...d'edition, ou d'une structure d'editeurs primaire et secondaire a
multiples niveaux, qui collectent des articles d'informations dans au
moins une base de donnees destinee a fournir periodiquement aux
utilisateurs...
Publication Year: 2001 

Detailed Description 

source document is still preserved. It is important that text of 
synopsis could not be found by simple removing of some words and 
sentences from original document. 

It should be completely generated by filtering algorithm on the basis of 
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GENERATING PERSONALIZED USER PROFILES FOR UTILIZING THE GENERATED USER 
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PRODUCTION DE PROFILS UTILISATEURS PERSONNALISES , UTILES POUR EXECUTER DES 

RECHERCHES ADAPTATIVES DANS L' INTERNET 
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Patent: . . . 20000727 
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Cl ai tns 
Publication Year: 2000 



data item segment count being representative of a number of identical
segments in the corresponding data segment group of said at least one
data segment group, and linking each said data...

...remote computer system, in a descending order of data item segment
counts starting from a data segment group having a highest data item
segment count, and recording said data segment groups and
corresponding data item segment counts in
said data item profile; and

(qq) storing, by the remote computer system...

...sentence mark is reached before said word count
reaches a predefined word limit, storing said counted words as a
sentence, restarting said word count, and repeating said step (rr)
starting after a last word of said stored sentence;
and

(tt) when said word count reaches said predefined word limit, storing
said counted words as a sentence, restarting said word count, and
repeating said step (rr) starting after a last word of said stored
sentence.
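The segmentation rule in the claim steps above (close a sentence at a sentence mark, or when the word count reaches a predefined limit, then restart the count) can be sketched in Python. The set of sentence marks, the default limit, and the function name are assumptions for illustration; the claim text does not fix them.

```python
SENTENCE_MARKS = (".", "!", "?")  # hypothetical set of sentence marks


def split_into_sentences(text, word_limit=10):
    """Store counted words as a sentence at a sentence mark or at the word limit."""
    sentences, current = [], []
    for word in text.split():
        current.append(word)
        # Either condition closes the sentence and restarts the word count.
        if word.endswith(SENTENCE_MARKS) or len(current) >= word_limit:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing words with no closing mark
        sentences.append(" ".join(current))
    return sentences
```

For example, with `word_limit=4` the text "One two three. Four five six seven eight" yields three stored sentences: the first closed by its full stop, the second by the word limit, and the last holding the leftover word.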
1. . . 
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Fulltext word Count: 11137 

Patent and Priority information (Country, Number, Date): 

Patent : ... 19971016 

Fulltext Availability: 

Detailed Description 
Publication Year: 1997 

Detailed Description 

to form words and build a sentence, the user 
simply selects buttons containing words or groups of 
advantage that more information can be appended with 
fewer bits of data information and in less time. 

The present invention has advantages in building
messages. The ability to build meaningful but concise
sentences is made possible through preprogrammed words
and phrases found in each syntax Block category. The
user selects from the list of choices, and presses...
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Title: Mining product reputations on the web 

Author: Morinaga, Satoshi; Yamanishi, Kenji; Tateishi, Kenji; Fukushima,
Toshikazu

Corporate Source: NEC Corporation, Kawasaki, Kanagawa 216-8555, Japan
Conference Title: KDD - 2002 Proceedings of the Eighth ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining
Conference Location: Edmonton, Alta, Canada Conference Date:
20020723-20020726
Sponsor: SIGKDD; ACM Special Interest Group on Knowledge Discovery and
Data
E.I. Conference No.: 61746
Source: Proceedings of the ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining 2002. p 341-349
Publication Year: 2002 
Language: English 

Document Type: CA; (Conference Article) Treatment: T; (Theoretical) 
Journal Announcement: 0312W1 

Abstract: Knowing the reputations of your own and/or competitors'
products is important for marketing and customer relationship management.
It is, however, very costly to collect and analyze survey data
manually. This paper presents a new framework for mining product
reputations on the Internet. It automatically collects people's opinions
about target products from web pages, and it uses text mining techniques
to obtain the reputations of those products. On the basis of human-test
samples, we generate in advance syntactic and linguistic rules to
determine whether any given statement is an opinion or not, as well as
whether any such opinion is positive or negative in nature. We first
collect statements regarding target products using a general search
engine, and then, using the rules, extract opinions from among them and
attach three labels to each opinion, labels indicating the
positive/negative determination, the product name itself, and a numerical
value expressing the degree of system confidence that the statement is, in
fact, an opinion. The labeled opinions are then input into an opinion
database. The mining of reputations, i.e., the finding of statistically
meaningful information included in the database, is then conducted. We
specify target categories using label values (such as positive opinions of
product A) and perform four types of text mining: extraction of 1)
characteristic words, 2) co-occurrence words, 3) typical sentences,
for individual target categories, and 4) correspondence analysis among
multiple target categories. Actual marketing data is used to demonstrate
the validity and effectiveness of the framework, which offers a drastic
reduction in the overall cost of reputation analysis over that of
conventional survey approaches and supports the discovery of knowledge
from the pool of opinions on the web. 27 Refs.

Descriptors: *Data mining; World Wide Web; Electronic commerce;
Competition; Marketing; Customer satisfaction; Syntactics

Identifiers: Product reputations; Marketing data; Opinion labeling

Classification Codes:

723.2 (Data Processing); 723.5 (Computer Applications); 911.2
(Industrial Economics); 911.4 (Marketing)

723 (Computer Software, Data Handling & Applications); 911 (Cost &
Value Engineering; Industrial Economics); 912 (Industrial Engineering &
Management)
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PAGE 1555 . 124 PAGES 
Descriptors: LANGUAGE, LINGUISTICS 
Descriptor Codes: 0290 
ISBN: 0-612-99051-6 

Sentences with multiple preverbs and/or particles are examined in this
thesis. The data sentences were collected from the first 18 stories of
the Labrador Innu Text Project. Chapter 1 is an introduction to Innu-aimun
grammar, with sections on previous research into word ordering, especially
preverb ordering. Chapter 2 describes the patterning, use and
co-occurrence of the ten most common preverbs in the data sentences.
Preverbs are subdivided into modal preverbs, temporal preverbs, aspectual
preverbs and other preverbs. Chapter 3 discusses 28 common particles in the
data. These particles are also divided into smaller groups, including
complementizers, focus particles, negative particles, adverbs, temporal and
aspectual particles, particles of speaker opinion and particles with
changed forms. Both chapters 2 and 3 include discussion of regular patterns
of ordering of preverbs or particles. Chapter 4 is an analysis of the use
of the independent or conjunct orders following negative particles.
Optimality Theory is used to explain Innu data, and sentences are analyzed
based on Brittain (2001, 1997). A general thesis conclusion ends chapter 4.
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This thesis addresses the problems of extracting useful information 
from changes in a data stream including natural language. The dependencies 
within language complicate this analysis. Monitoring all combinations of 
lexical items is not computationally feasible. Even statistical tests upon 
single word occurrences can reveal many apparent differences. Yet the 
individual changes often reflect comparatively few events influencing the 
data. To automatically ascertain the causes of changes in the data stream 
requires methods for finding structure within the set of individual
changed items. In this work we develop several techniques that extract
such structure.

One approach utilizes word associations to cluster detected changes.
The changing relationships between different lexical items (for
example, the difference in correlation for word occurrence indicator
variables) provide a notion of dissimilarity. A clustering algorithm
with these dissimilarities as input will output groups of words that
exhibit the same profile of changing co-occurrences with other words.
This isolates novel sentence patterns. Changes connected to some
unanticipated event cluster together, thus are readily interpreted.

Changes can be further explained by attaching them to some subset of 
the data stream. Divisive clustering techniques make this practical; 
similar data entries largely remain together through the clustering 
process. Clustering recursively reduces the complexity of the problem, 
stratifying the full language model into more homogeneous sub-languages. 
Analysis continues on these smaller, more tractable subsets. Comparing 
global to cluster-based tests can distinguish changes in the relative 
frequencies of known utterance types from novel data. 

Explicit conditioning isolates the data containing particular lexical
items; standard process control tests select those features that alter in
frequency. This algorithm peels away portions of the data until it detects
no changes within the remainder. Implicit conditioning divides the language
model so as to maximize the sample probability. This utilizes all lexical
items in each data entry.

Such techniques can be combined. Together they provide an analysis 
package suitable for applications such as maintaining quality within an 
automated call center. A machine can call human attention to data that 
exhibits unexpected behavior in time, and help determine the nature of the 
observed change. 
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The principal problem with information management today is organizing 
the ever-widening body of electronically available text. Manual techniques 
for filtering and structuring such information are useful for sifting 
through a collection of texts, but manual approaches cannot keep pace
with the quantity and variety of text generated. Outside of well-funded
fields such as law and medicine, there is little availability of any
techniques other than simple word and stem matching for wading through this
information. Such string matching techniques are thwarted, however, by the
language variability problem, in which a similar idea is expressed by a
variety of different words.

We defend the thesis that selective Natural Language Processing, 
applying subsets of known language processing techniques, over a 
collection of texts provides enough information to create equivalence 
classes between different terms, thus easing the problem of language 
variability, we present a method using partial syntactic analysis that 
allows creation of equivalence classes over any body of text and we show 
that the classes created by this method are more like manually-created 
classes than those created by document co - occurrence , sentence co - 
occurrence and window-based equivalence class creation techniques. Results 
of applying this method to information retrieval, thesaurus enrichment, and 
creation of automatic thesauri are also presented. 

The main contributions of this thesis are the following: we describe a
robust domain-independent partial parser for English which yields local
syntactic contexts of words; we produce a method for using this context to
create corpus-dependent similarity lists; we demonstrate that the
similarities extracted by this method correspond to human similarity
judgments by comparison with psychological data and by showing the overlap
with manually created thesauri; we demonstrate that the overlap with manual
thesauri using this syntactic context is greater than that obtained by
traditional textual windowing techniques; we develop evaluation methods
applicable to any corpus-based meaning extraction technique (artificial
synonyms and gold standard measurements); and we show applications of our
similarity discovery techniques to information retrieval, thesaurus
enrichment, and automatic thesaurus construction.
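
The idea of building similarity lists from local syntactic contexts can be
sketched as follows. This is only an illustration under stated assumptions:
the (word, context) pairs stand in for the output of a partial parser, the
plain Jaccard overlap is a substitute for whatever weighting the thesis
actually uses, and the names similarity_lists and PAIRS are hypothetical.

```python
from collections import defaultdict


def similarity_lists(pairs, top_n=3):
    """Rank words by Jaccard overlap of their syntactic contexts.

    `pairs` is an iterable of (word, context) tuples, where a context is
    any hashable label such as "obj-of:drive" (assumed parser output).
    """
    contexts = defaultdict(set)
    for word, context in pairs:
        contexts[word].add(context)

    result = {}
    for word in contexts:
        scored = []
        for other in contexts:
            if other == word:
                continue
            shared = len(contexts[word] & contexts[other])
            total = len(contexts[word] | contexts[other])
            if shared:
                scored.append((shared / total, other))
        # Highest overlap first; ties broken alphabetically.
        scored.sort(key=lambda item: (-item[0], item[1]))
        result[word] = [other for _, other in scored[:top_n]]
    return result


# Hypothetical parser output, invented for illustration only.
PAIRS = [
    ("car", "obj-of:drive"), ("car", "mod:red"), ("car", "obj-of:park"),
    ("truck", "obj-of:drive"), ("truck", "obj-of:park"),
    ("apple", "mod:red"), ("apple", "obj-of:eat"),
]
SIMILAR = similarity_lists(PAIRS)
```

With this toy input, "truck" ranks above "apple" in the list for "car"
because it shares two of three contexts rather than one of four.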
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Author Affiliation: Dept. of Phys. Sci., R. Brisbane Hospital, Herston,
Qld., Australia

Journal: Medical Physics vol.23, no. 4 p. 549-55 

Publisher: AIP for American Assoc. Phys. Med.

Publication Date: April 1996 Country of Publication: USA 
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Material identity Number: M190-96005 
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Language: English Document Type: Journal Paper (JP) 
Treatment: Practical (p) ; Experimental (x) 

Abstract: The positive predictive value of mammography is between 20% and
25% for clustered microcalcifications. For very early cancers there is
often a lack of concordance between mammographic signs and pathology. This
study examines the usefulness of computer texture analysis to improve the
accuracy of malignant diagnosis. Texture analysis of the breast tissue
surrounding microcalcifications on digitally acquired images during
stereotactic biopsy is used in this study to predict malignant vs. benign
outcomes. 54 biopsy proven cases (36 benign, 18 malignant) are used. The
texture analysis calculates statistical features from gray level
co-occurrence matrices and fractal geometry for equal probability and linear
quantizations of the image data. Discriminant models are generated using
linear discriminant analysis and logistic discriminant analysis. Results do
not differ significantly by method of quantization or discriminant
analysis. Jackknife results misclassify 2 of 18 malignant cases
(sensitivity 89%) and 6 of 36 benign cases (specificity 83%) for logistic
discriminant analysis. From this preliminary study, texture analysis
appears to show significant discriminatory power between benign and
malignant tissue, which may be useful in resolving problems of discordance
between pathological and mammographic findings, and may ultimately reduce
the number of benign biopsies. (28 Refs)
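
The reported sensitivity and specificity follow directly from the
misclassification counts in the abstract. The short check below reproduces
the arithmetic; the function and variable names are illustrative only.

```python
def rate(correct, total):
    """Percentage of cases classified correctly."""
    return 100.0 * correct / total


# Counts taken from the abstract: 2 of 18 malignant and 6 of 36 benign
# cases were misclassified under jackknife evaluation.
malignant_total, malignant_missed = 18, 2
benign_total, benign_missed = 36, 6

sensitivity = rate(malignant_total - malignant_missed, malignant_total)
specificity = rate(benign_total - benign_missed, benign_total)
```

Here sensitivity is 16/18 and specificity is 30/36, which round to the 89%
and 83% quoted above.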
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Abstract: Describes two methods for the acquisition and utilization of 
lexical cooccurrence relationships. Under these methods, cooccurrence
relationships are obtained from two kinds of inputs: example sentences
and the corresponding correct syntactic structure. The first of the two
methods treats a set of governors, each element of which is bound to an
element of the sister node set in a syntactic structure under
consideration, as a cooccurrence relationship. In the second method, a
cooccurrence relationship name and affiliated attribute names are manually
given in the description of augmented rewriting rules. Both methods
discriminate correctness of cooccurrence by the use of the correct
syntactic structure mentioned above. Experiments are conducted for both
methods to determine whether the cooccurrence relationships thus obtained
are useful for correct analysis. (2 Refs)
Subfile: C 

Descriptors: grammars; rewriting systems 

Identifiers: parser; acquisition; utilization; lexical cooccurrence;
example sentences; correct syntactic structure; cooccurrence relationship 
name; affiliated attribute; rewriting rules 
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Abstract: Thesauri (groupings of related words) are an important 
potential adjunct to information retrieval. Automatically generated 
thesauri have generally been based on statistical analyses of word
co-occurrence within documents or sentences. Progress in mechanical syntax
analysis raises the question of how information on the grammatical relation
between words in a sentence could enhance thesaurus generation. The authors
have developed a program to use this information, clustering nouns on
the basis of the verbs with which they occur (as subject or object) and
verbs on the basis of nouns (and other verbs) with which they occur. This
program, applied to a small set of transformationally analyzed
pharmacology texts, has yielded clusters in good agreement with the
semantic word classes recognized by pharmacologists. These clusters can 
further be used in constructing informational formats for the text. (2 
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Country of Publication: France 

Language: French Summary Language: French; English 

Starting from the problem of developing a document retrieval system that
allows a text base to be queried in natural language, we were led to design
and implement a prototype document retrieval system. In accordance with our
working hypotheses, our system requires no a priori domain-dependent
knowledge. We have attempted to show that it is possible, without going
through a knowledge modeling phase, to extract from texts a certain amount
of information useful for information retrieval. To this end, we favored
automatic indexing techniques. The originality of our system lies in
simultaneously taking into account two aspects of the document: (1) the
logical structure of the document: this prototype imposes no particular
constraint on document structure and can process any set of hierarchically
composed texts; (2) the relations extracted from a cooccurrence analysis of
the noun phrases of the text. A thesaurus is built automatically from the
analyzed texts. Our system was tested on two corpora that differ
considerably in both content and structure; the first results seem
encouraging. Building this system allowed us to glimpse certain problems
linked to natural language processing techniques. We believe that
statistical and linguistic techniques combine advantageously within a
document retrieval system. However, while many statistical part-of-speech
tagging programs exist for English, little work has been done along these
lines for French. The system developed in this thesis provides a first
version of the thesaurus. Our final objective is for it to be regarded as a
genuine knowledge base for the domain.

English Descriptors: Automatic indexing; Thesaurus; Automation; Automated
processing; Document retrieval system; Information retrieval; Document
structure; Linguistic analysis; Natural language; Automatic processing

French Descriptors: Indexation automatique; Thesaurus; Automatization;
Traitement automatise; Systeme documentaire; Recherche information;
Structure document; Analyse linguistique; TAL; Langage naturel;
Traitement automatique



Classification Codes: OOIaOIeOIb; 205 
Copyright (c) 1998 INIST-CNRS. All rights reserved.



15/5/9 (Item 2 from file: 144) 
DIALOG (R) File 144: Pascal 
(c) 2008 INIST/CNRS. All rts. reserv. 

13012509 PASCAL No.: 97-0296001 

Why words and co-words cannot map the development of the sciences

LEYDESDORFF L

Department of Science and Technology Dynamics, Nieuwe Achtergracht 166,
1018 WV Amsterdam, Netherlands

Journal: Journal of the American Society for Information Science, 1997,
48 (5) 418-427

ISSN: 0002-8231 CODEN: AISJB6 Availability: INIST-6025;
354000065515360030
No. of Refs.: 32 ref.

Document Type: P (Serial); A (Analytic)
Country of Publication: United States
Language: English 

A restricted set of full-text articles from a sub-specialty of
biochemistry was analyzed and compared in terms of co-occurrences and
co-absences of words. By using the distribution of words over the sections,
a clear distinction among "theoretical," "observational," and
"methodological" terminology can be made in individual articles. However,
at the level of the set this structure is no longer retrievable: words
change both in terms of frequencies of relations with other words, and in
terms of positional meaning from one text to another. These results accord
with Hesse's (1980) thesis about the sciences as fluid networks. The
fluidity of networks in which nodes and links may change positions is
expected to destabilize representations of developments of the sciences on
the basis of co-occurrences and co-absences of words. The consequences for
the lexicographical approach to generating artificial intelligence from
scientific texts are discussed.

English Descriptors: Scientific literature; Content analysis; Bibliometrics;
Biochemistry; Sentence; Discriminant analysis; Graphics; Models;
Sample; Artificial intelligence; Lexicography; Cooccurrence analysis;
Coword; Information representation; Bibliometric map
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cooccurrence; Mot associe; Representation information; Carte 
bibliometrique 
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ABSTRACT: 

Kana-to-kanji (phonogram-to-ideogram) conversion technology is nowadays 
common in Japanese word processor development. However, correct conversion 
without human interaction is still quite difficult because of the existence 
of many homonyms. We propose a new method to process homonyms on the basis
of the co-occurrence relation between a noun and a verb in a sentence.
Our method is based on the idea that nouns which co-occur in a simple
sentence share the sentence-final verb as a governor. Therefore, the most
feasible candidates of kanji nouns in an input simple sentence are those 
each of which co-occurs with an identical verb in a simple sentence with 
the highest frequency. An experimental kana-to-kanji conversion using our 
new method for 1129 simple sentences has shown that the conversion is 
carried out in 93.3% of the sentences and that the accuracy is 63.0%. Our 
method is shown to be more effective than the ordinary method based on word 
occurrence frequency. 
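
A much simplified sketch of the selection step described above: among the
kanji candidates for one kana reading, prefer the noun that co-occurs most
often with the sentence-final verb. The counts, the candidate list, and the
name pick_kanji are assumptions made up for illustration; the abstract does
not give the actual data or scoring details.

```python
from collections import Counter


def pick_kanji(candidates, verb, cooc_counts):
    """Return the candidate noun seen most often with `verb`.

    Returns None when no candidate has ever been observed with the verb,
    so that a caller could fall back to another method.
    """
    best = max(candidates, key=lambda noun: cooc_counts[(noun, verb)])
    return best if cooc_counts[(best, verb)] > 0 else None


# Hypothetical noun-verb co-occurrence counts. The three nouns are
# homonyms sharing the reading "hashi" (bridge, chopsticks, edge).
COOC = Counter({
    ("橋", "渡る"): 12,  # bridge + cross
    ("箸", "使う"): 9,   # chopsticks + use
    ("端", "渡る"): 1,   # edge + cross
})
HOMONYMS = ["橋", "箸", "端"]

choice_cross = pick_kanji(HOMONYMS, "渡る", COOC)
choice_use = pick_kanji(HOMONYMS, "使う", COOC)
choice_unknown = pick_kanji(HOMONYMS, "読む", COOC)
```

Returning None for an unseen verb mirrors the abstract's observation that
conversion is carried out for only part of the input sentences.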

DESCRIPTORS: CHARACTER SET; MESSAGE PROCESSING; ALGORITHM; CHARACTER
RECOGNITION; CHARACTER GENERATORS

IDENTIFIERS: JAPANESE HOMONYMS; WORD COOCCURRENCE; SIMPLE SENTENCE;
KANA TO KANJI CONVERSION; JAPANESE WORD PROCESSORS; KANJI NOUNS; HOMONYM;
Textverarbeitung; japanisches Homonym; Zeichensatz
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FILE SEGMENT: Computer & Information Systems Abstracts
ABSTRACT:

We propose two Japanese information retrieval methods that enhance
retrieval effectiveness using relationships between words. One is a method
using dependency relationships between words in a sentence, and the other
is a method using the ordered co-occurrence information of words in a
sentence as an approximation to the dependency relationships between them.
Through retrieval experiments using the Japanese test collection for
information retrieval systems NTCIR-1, we showed that our two methods are
superior to the TF-IDF method in retrieval effectiveness and that the
difference between our two methods is small. These results are independent
of the document set and of the search topic set.
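
The second method, ordered co-occurrence as a stand-in for dependency
relations, can be illustrated with a small sketch. It assumes sentences are
already tokenized and scores a document sentence by the share of ordered
query word pairs it contains; the scoring function and all names here are
hypothetical and are not the weighting used in the reported experiments.

```python
from itertools import combinations


def ordered_pairs(tokens):
    """All (earlier, later) word pairs of a sentence, order preserved."""
    return set(combinations(tokens, 2))


def pair_overlap(query_tokens, doc_tokens):
    """Fraction of ordered query pairs that also occur in the document."""
    query_pairs = ordered_pairs(query_tokens)
    if not query_pairs:
        return 0.0
    return len(query_pairs & ordered_pairs(doc_tokens)) / len(query_pairs)


QUERY = ["patent", "search", "system"]
same_order = pair_overlap(QUERY, ["fast", "patent", "search"])
reversed_order = pair_overlap(QUERY, ["search", "patent"])
```

Because the pairs are ordered, a sentence containing the same words in the
opposite order contributes nothing, which is what distinguishes this from
plain bag-of-words co-occurrence.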



18/5/1 (Item 1 from file: 8)
DIALOG (R) File 8:Ei Compendex(R)
(c) 2008 Elsevier Eng. Info. Inc. All rts. reserv.

09894160 E.I. No: EIP04248209159 
Title: Information content in Medline record fields

Author: Kostoff, Ronald N. ; Block, Joel A.; Stump, Jesse A.; Pfeil, 
Kirstin M. 

Corporate Source: Office of Naval Research, Arlington, VA 22217, United
States

Source: International Journal of Medical Informatics v 73 n 6 Jun 30
2004. p 515-527

Publication Year: 2004 

CODEN: UMIF4 ISSN: 1386-5056 

Language: English 

Document Type: JA; (Journal Article) Treatment: T; (Theoretical)
Journal Announcement: 0406W4 

Abstract: Background: The authors have been conducting text mining 
analyses (extraction of useful information from text) of Medline records, 
using Abstracts as the main data source. For literature-based 
discovery, and other text mining applications as well, all records in a 
discipline need to be evaluated for determining prior art. Many Medline 
records do not contain Abstracts, but typically contain Titles and Mesh 
terms. Substitution of these fields for Abstracts in the non-Abstract 
records would restore the missing literature to some degree. Objectives: 
Determine how well the information content of Title and Mesh fields 
approximates that of Abstracts in Medline records. Approach: Select 
historical Medline records related to Raynaud's Phenomenon that contain 
Abstracts. Determine the information content in the Abstract fields 
through text mining. Then, determine the information content in the Title 
fields, the Mesh fields, and the combined Title-Mesh fields, and compare 
with the information content in the Abstracts. Results: Four metrics were 
used to compare the information content related to Raynaud's Phenomenon in 
the different fields: total number of phrases; number of unique phrases; 
content of factors from factor analyses; content of clusters from 
multi-link clustering. The Abstract field contains almost an order of
magnitude more phrases than the other fields, and slightly more than an 
order of magnitude more unique phrases than the other fields. Each field 
used a factor matrix with 14 factors, and the combination of all 56 
factors for the four fields represented 27 separate, but not unique, 
themes. These themes could be placed in two major categories, with two 
sub-categories per major category: Auto-immunity (antibodies, inflammation) 
and circulation (peripheral vessel circulation, coronary vessel 
circulation). All four sub-categories included representation from each 
field. Thus, while the focus of the representation of each field in each
sub-category was moderately different, the four sub-category structure
could be identified by analyzing the total factors in each field. In the
cluster comparison phase of the study, the phrases used to create the
clusters were the most important phrases identified for each factor. Thus,
the factor matrix served as a filter for words used for clustering. While
clusters were generated for all four fields, the Title hierarchy tended to
be fragmented due to sparsity of the co-occurrence matrix that
underlies the clusters. Therefore, the Title clusters were examined at only
the lower levels of aggregation. The Abstract, Mesh, and Mesh + Title 
fields had the same first level taxonomy categories, auto-immunity and 
circulation. At the second level, the Abstract, Mesh, and Mesh + Title 
fields had the autoimmune diseases and antibodies sub-category in common. 
The Abstract and Mesh fields shared fascia inflammation as the other 
auto-immunity sub-category, while the other Mesh + Title sub-category 
focuses on vinyl chloride poisoning from industrial contact, and 
consequences of antineoplastic agents. However, in both cases, even though 
the words may be different, inflammation may be the common theme. 
Conclusions: For taxonomy generation, especially at the higher levels, 
each of the four fields has a similar thematic structure. At very detailed 
levels, the Mesh and Title fields run out of phrases relative to the 
Abstract field. Therefore, selection of field(s) to be employed for
taxonomy generation depends on the objectives of the study, particularly 
the level of categorization required for the taxonomy. For information
retrieval, or literature-based discovery, selection of the appropriate
field again depends on the study objectives. If large queries, or large
numbers of concepts or themes are desired, then the field with the largest
number of technical phrases would be desirable. If queries or concepts
represented by the more accepted popular terminology are adequate, then the
smaller fields may be sufficient. Because of its established and
controlled vocabulary, the Mesh field lags the Title or Abstract fields in 
currency. Thus, the Title or Abstract fields would retrieve records with 
the most explicitly stated current concepts, but the Mesh field would 
capture a larger swath of fields that contained a concept of interest but 
perhaps had a wider range of specific terminology in the Abstract or Title
text. In addition, this study provides the first validated estimate of the
disparity in information retrieved through text mining limited to Titles 
and Mesh terms relative to entire Abstracts. As much of the older 
biomedical literature was entered into electronic databases without 
associated Abstracts, literature-based discovery exercises that search the 
older medical literature may miss a substantial proportion of relevant 
information. On the basis of this study, it may be estimated that up to a 
log order more information may be retrieved when complete Abstracts are 
searched. 24 Refs. 
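
The remark that the Title hierarchy fragments because its co-occurrence
matrix is sparse can be made concrete with a toy calculation. The sketch
below measures the fraction of phrase pairs that never share a record; the
records, phrase lists, and function name are invented for illustration and
do not come from the study.

```python
from itertools import combinations


def cooccurrence_sparsity(records):
    """Fraction of distinct phrase pairs that never share a record."""
    vocabulary = {phrase for record in records for phrase in record}
    seen_pairs = set()
    for record in records:
        seen_pairs.update(combinations(sorted(set(record)), 2))
    possible = len(vocabulary) * (len(vocabulary) - 1) // 2
    return 1.0 - len(seen_pairs) / possible if possible else 0.0


# Hypothetical data: two short titles versus one longer abstract that
# mentions the same four phrases together.
TITLES = [["raynaud", "therapy"], ["antibody", "scleroderma"]]
ABSTRACTS = [["raynaud", "therapy", "antibody", "scleroderma"]]

title_sparsity = cooccurrence_sparsity(TITLES)
abstract_sparsity = cooccurrence_sparsity(ABSTRACTS)
```

Short fields put few phrases in each record, so most pairs are never
observed together and clustering has little to link on.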

Descriptors: -Medical imaging; Information science; Data reduction;
Abstracting; Antibodies; Database systems; Matrix algebra

Identifiers: Electronic databases; Data sources

Classification Codes: 

461.9.1 (Immunology)

461.1 (Biomedical Engineering); 723.2 (Data Processing); 903.1
(Information Sources & Analysis); 461.9 (Biology); 723.3 (Database
Systems); 921.1 (Algebra)

461 (Bioengineering); 903 (Information Science); 723 (Computer
Software, Data Handling & Applications); 921 (Applied Mathematics)

46 (BIOENGINEERING); 90 (ENGINEERING, GENERAL); 72 (COMPUTERS & DATA
PROCESSING); 92 (ENGINEERING MATHEMATICS)



18/5/2 (Item 2 from file: 8) 
DIALOG (R) File 8:Ei Compendex(R) 
(c) 2008 Elsevier Eng. Info. Inc. All rts. reserv.

08147905 E.I. No: EIP98114433514 

Title: Texture classification of engineering surfaces with nanoscale 
roughness 

Author: Grigoriev, A.Ya. ; Chizhik, S.A. ; Myshkin, N.K. 
Corporate Source: Belarus Acad of Sciences, Gomel, Byelorussia 
Source: International Journal of Machine Tools & Manufacture v 38 n 5-6
May-Jun 1998. p 719-724 
Publication Year: 1998 
CODEN: IMTME3 ISSN: 0890-6955 
Language: English 

Document Type: JA; (Journal Article) Treatment: G; (General Review)
Journal Announcement: 9812W4 

Abstract: The spatial structure of the surface layer, or texture, is
important for surface topography characterization. In many respects a
texture determines the contact behavior of rough surfaces. Despite the
increasing role of precision mechanics, the texture of engineering
surfaces has not been adequately investigated. In this paper pattern
recognition theory is introduced to perform surface texture
classification. The height-coded images obtained by atomic force microscopy
were used as initial data. The images represent the surface textures of
various materials formed by various processes. We take the following
procedure for the texture classification. First, the texture is
characterized by a matrix of co-occurrence of image contrast. Next, the
matrix is transformed into a feature vector by the Karhunen-Loeve
transformation. The feature vector is considered as the coordinates of a
point in the multidimensional feature space. The location of the point
depends on the peculiarities of the surface texture. The set of points forms
clusters that correspond to different classes of textures. The mutual
arrangement of the points and structure of the clusters were analyzed by
the multidimensional scaling procedure. It was found that there are at
least four classes of surface reliefs. The first three of them are related to
the properties of the surface material and the last to the process of growth
and crystallization on the interface of different materials. (Author
abstract) 18 Refs.
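
The first step of the procedure, a co-occurrence matrix over image levels,
can be sketched in a few lines. This is a generic gray level co-occurrence
matrix with a contrast statistic, offered as an assumption about the kind of
matrix meant; it does not cover the Karhunen-Loeve or multidimensional
scaling steps, and the function names and the tiny image are illustrative.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often level i is followed by level j at offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Mean squared level difference, weighted by co-occurrence counts."""
    total = sum(map(sum, matrix))
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total


# A 2x3 image quantized to two levels (0 and 1), for illustration only.
IMAGE = [
    [0, 0, 1],
    [1, 1, 0],
]
MATRIX = glcm(IMAGE, levels=2)
CONTRAST = contrast(MATRIX)
```

Each horizontal neighbor pair occurs once in this toy image, so the matrix
is uniform and half of the pairs differ by one level.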
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Learning disabled youth represent the largest portion of special 
education youth in juvenile corrections. Race, gender, poverty, and urban 
living are all factors shown to increase the likelihood of being classified 
as learning disabled and identified for placement in special education, and 
at-risk for school failure. Research demonstrates that at-risk youth are 
often several years below grade level in one or more academic areas, have 
higher absenteeism rates, increased grade-level retention, higher dropout 
rates, and poorer post-school outcomes, including lower levels of 
meaningful employment, and higher arrest, incarceration, and recidivism 
rates. While research has identified a variety of factors linking academic
failure and delinquency, it has been grounded in the identification of 
person-centered factors rather than external factors such as those found in 
the home, school, and community. The present study is a qualitative inquiry 
of adolescent learning disabled youths' perceptions of their past public 
schooling experiences. The study expands an examination of schooling to 
include the additional high-risk contexts of the youths' home and 
community. In-depth interviews were the primary data collection tool
used for accessing the stories of these adolescent juveniles. The findings 
of the study suggest that youth responded and made decisions relative to 
their needs and socially stigmatized positioning as learning disabled 
students; key events co-occurring with the tasks, challenges, and coping
abilities of adolescence contributed to their interrupted school careers; 
and, persistent home, school, and community risk factors exceeded 
protective factors available to these adolescent students, limiting their 
ability to successfully adapt and respond. The distal impact of these 
factors on their development is demonstrated in their poor educational 
outcomes and increased incidences of court involvement. Students receive 
insufficient opportunities in their risk-prone contexts for a level of 
social development that lays the groundwork for adjustment and competence 
in adolescence and as they move toward adult roles. 
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The Palo Duro mouse, Peromyscus truei comanche, is known to occur only 
in the Texas Panhandle along the eastern "Caprock" escarpment of the Llano 
Estacado and is currently listed as threatened by the state of Texas. 
Previous studies to determine specific habitat associations that limit the 
species' distribution have been inconclusive. Remotely sensed data and 
Geographic Information Systems (GIS) were used to identify and characterize
the habitat of this species, and predict the distribution of P. t. comanche
along the Llano Estacado based on collection data, known habitat
preferences, and vegetational and soil distributions. Spatial data sources
employed for characterization were digitized locational information,
1:250,000 Land Use and Land Cover (LULC) data, and 1:250,000 State Soil
Geographic Data Base (STATSGO) data. The co-occurrence of specific
vegetation and soil types with recognized collection localities was used to 
identify and characterize the habitat, and predict the distribution of P. 
t. comanche. 
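
The co-occurrence step can be pictured as a simple overlay count. In the
sketch below, each collection locality is reduced to a (vegetation, soil)
pair, pairs that account for a large enough share of localities form a
habitat signature, and map cells matching the signature are flagged. The
threshold, class labels, and function names are all assumptions for
illustration; the abstract does not describe the actual GIS workflow.

```python
from collections import Counter


def habitat_signature(localities, min_share=0.5):
    """Vegetation/soil pairs covering at least `min_share` of localities."""
    counts = Counter(localities)
    total = sum(counts.values())
    return {pair for pair, n in counts.items() if n / total >= min_share}


def predict(cells, signature):
    """IDs of map cells whose vegetation and soil match the signature."""
    return [cell_id for cell_id, veg, soil in cells if (veg, soil) in signature]


# Hypothetical localities and map cells, invented for illustration.
LOCALITIES = [
    ("juniper", "rocky"), ("juniper", "rocky"),
    ("juniper", "loam"), ("grass", "clay"),
]
CELLS = [(1, "juniper", "rocky"), (2, "grass", "clay"), (3, "juniper", "rocky")]

SIGNATURE = habitat_signature(LOCALITIES)
PREDICTED = predict(CELLS, SIGNATURE)
```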
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Paradoxically, the explosion of scientific information has resulted in 
diminishing awareness. In the face of an ever growing body of literature,
disciplines are becoming increasingly specialized, while individuals and
groups are becoming ever more insular. The availability of scientific
bibliographies in online databases is a rich source of scientific
information for scientists to support their research. In this paper, we
propose a new method to predict new association rules of concepts by mining
current scientific literature. In contrast to previous related research,
our method's novelties are as follows: extend the antecedent and consequent
of an association rule from a concept to a set of concepts; measure the
relationship between two concepts not only by their co-occurrence in
scientific literature, but also by their inherent relationship in knowledge
bases; describe the appropriate degree of replacing a concept with its
sibling; propose some indicators to distinguish various valid changes of
existing association rules. The predicted new association rules can serve
researchers as major repositories of candidates for new research themes, as
impetus for inspiration, or as hypotheses to be tested in the future.
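
For readers unfamiliar with association rules over concept sets, the two
standard measures, support and confidence, can be computed from document
co-occurrence as sketched below. This covers only the literature
co-occurrence side; the knowledge-base relationship and the prediction
indicators mentioned in the abstract are not modeled, and the concept names
are invented.

```python
def support(transactions, items):
    """Share of transactions containing every item in `items`."""
    wanted = set(items)
    return sum(wanted <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for sets of concepts."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical documents, each reduced to the set of concepts it mentions.
DOCS = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "blood viscosity", "raynaud"},
    {"fish oil"},
    {"raynaud"},
]
RULE_SUPPORT = support(DOCS, {"fish oil", "blood viscosity"})
RULE_CONFIDENCE = confidence(DOCS, {"fish oil"}, {"blood viscosity"})
```

Both the antecedent and the consequent are sets, which matches the
abstract's extension of rules from single concepts to sets of concepts.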
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This paper reports the results of our experimental study on a new method 
of applying an association rule miner to discover useful information from a 
text database. It has been claimed that association rule mining is not 
suited for text mining. To overcome this problem, we propose (1) to 
generate a sequential data set of words with dependency structure from
a Japanese text database, and (2) to employ a new method for extracting
meaningful association rules by applying a new rule selection criterion.
Each inquiry was converted to a list of word pairs having a dependency
relationship in the original sentence. The association rules were acquired
regarding each pair of words as an item. The rule selection criterion was
derived from our principle of giving heavier weights to the co-occurrence
of multiple items than to single item occurrence. We regarded a rule as
important if the existence of the items in the rule body significantly
affected the occurrence of the item in the rule head. Based on this method,
we conducted experiments on a customer inquiry database in a call center of
a company and successfully acquired practical, meaningful rules, which were
neither too general nor too rare. Also, they could not be acquired by
simple keyword retrieval alone. Additionally, inquiries with multiple
aspects were properly classified into corresponding multiple categories.
Furthermore, we compared (i) rules obtained from a sequential data set
of words with dependency structure, which we propose in this paper, and
those without dependency structure, as well as (ii) rules acquired through
the association rule selection criterion and those through the conventional
criteria. As a result, discovery of meaningful rules increased 14.3-fold in
the first comparison, and we confirmed that our criterion enables rules to
be obtained more precisely according to the objectives in the second comparison.
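
The notion that a rule matters when its body noticeably changes how often
its head occurs can be illustrated with lift, a standard association-rule
measure. Lift is used here only as a familiar stand-in; the abstract does
not state the authors' actual criterion. Items are dependency word pairs,
as in the abstract, and the inquiries below are invented.

```python
def lift(transactions, body, head):
    """Ratio of P(head | body) to P(head); values above 1 mean the body
    makes the head more likely than it is overall."""
    body = set(body)
    with_body = [t for t in transactions if body <= t]
    if not with_body:
        return 0.0
    p_head = sum(head in t for t in transactions) / len(transactions)
    if p_head == 0:
        return 0.0
    p_head_given_body = sum(head in t for t in with_body) / len(with_body)
    return p_head_given_body / p_head


# Hypothetical inquiries, each a set of (dependent, governor) word pairs.
INQUIRIES = [
    {("battery", "drain"), ("phone", "restart")},
    {("battery", "drain"), ("phone", "restart")},
    {("screen", "crack")},
    {("phone", "restart")},
]
LIFT_BATTERY = lift(INQUIRIES, [("battery", "drain")], ("phone", "restart"))
LIFT_SCREEN = lift(INQUIRIES, [("screen", "crack")], ("phone", "restart"))
```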
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Aim: Information on the community composition, structure, and dynamics of
epiphyte vegetation is scarce. A survey of the epiphytes occurring on all
individuals of one particular host tree species in a well-studied
neotropical research site allowed us a comparison of the epiphyte flora of
this tree with the local epiphyte flora, the analysis of spatial
distribution patterns and the use of these patterns as indications for
changes in time. In the future, our results can be used as a baseline
data set for the direct observation of the long-term dynamics in
epiphyte communities. Location: The study was conducted on Barro Colorado
Island (BCI), Panama. Methods: We recorded all individuals of the vascular
epiphytes growing on Annona glabra L., a flood-tolerant, multiple-stemmed
tree, which is restricted to the shoreline of BCI. Data on tree biometrics,
epiphyte species, and epiphyte abundances were collected for more than 1200
trees. Results: In total, we encountered almost 15,000 epiphytic individuals
in sixty-eight species, corresponding to more than one third of the entire
epiphyte flora of Barro Colorado Island. The component species differed
strongly in abundance: the four most important species accounted for >75%
of all individuals. In most cases, the same four species were also the
first to colonize a tree (=phorophyte). Colonization patterns indicated no
replacement of early colonizers by late arrivals. Species richness and
epiphyte abundances showed a positive correlation with the size and the
density of the host trees. All species showed a highly clumped distribution
and the physiognomy of epiphyte communities of individual trees was
dominated either by one or several of the four most common species or by a
set of frequently co-occurring tank bromeliads. Other species were
dominant only in exceptional cases. Most species were always rare. A
distance effect on community composition was mostly confined to a local
scale with an increased similarity in the species assemblage of stems of a
tree vs. neighbouring trees. Main conclusions: The epiphytes on a single
small phorophyte species may encompass a surprisingly large proportion of
the local epiphyte flora. The observations that most tree crowns are
inhabited by a single or only very few species, and that all epiphyte
species show highly clumped distributions suggest a predominance of very
local dispersal within a tree crown, which is only infrequently interrupted
by successful long-distance dispersal between crowns.
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Objective: To examine temporal priority in the relationship between 
psychiatric disorders and drug use. Method: Psychiatric assessments and 
drug use were completed at three different points in time, spanning 9 
years. Structured interviews were administered to a cohort of youths and 
their mothers. Subjects were selected on the basis of their residence in 
either of two counties in upstate New York. The sample was predominantly 
white male and female youths, aged 1 through 10 years upon initial
collection of data. Psychiatric diagnoses were assessed by a
supplemented version of the Diagnostic Interview Schedule for Children
version 1, using computer algorithms designed to match DSM-III-R criteria 
to combine information from mothers and youths. Substance use information 
was obtained in the interviews. Results: A significant relationship was 
found to exist between earlier adolescent drug use and later depressive and 
disruptive disorders in young adulthood, controlling for earlier 
psychiatric disorders. Earlier psychiatric disorders did not predict 
changes in young adult drug use. Conclusions: Implications for policy, 
prevention, and treatment include (1) more medical attention needs to be 
given to the use of legal and illegal drugs; and (2) a decrease in drug use 
may result in a decrease in the incidence of later psychiatric disorders. 
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Autism is a behavior disorder with genetic influences indicated from twin 
and family studies and from the cooccurrence of autism with known genetic
disorders. Tuberous sclerosis complex (TSC) is a known genetic disorder
with behavioral manifestations including autism. A literature review of
these two disorders substantiates a significant association of autism and
TSC with 17-58% of TSC subjects manifesting autism and 0.4-3% of autistic
subjects having TSC. In initial data collected on 13 TSC probands and
14 autistic probands in our family study of autism and TSC, we identified 7 
TSC subjects with autism 
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SUMMARY: DESCRIPTION (provided by candidate): I am applying for the 
Mentored Research Scientist Development Award (K01) program to promote my 
growth as an independent scientist. My goal with this award is to obtain 
the advanced training I need to develop an evidence-based treatment manual 
for a social skills intervention for children with autism spectrum 
disorders (ASD), and subsequently apply for an R01 to test these
interventions in clinical trials. Children and adolescents with ASD who are 
cognitively 'high-functioning' frequently present with symptoms of anxiety, 
in addition to the core impairment in social interaction. This anxiety can 
interfere with their ability to integrate into mainstream academic 
environments, undermine their use of appropriate social skills in natural 
peer contexts, and impede their overall development. Thus, anxiety can be 
seen as compounding the social disability inherent in spectrum disorders, 
preventing otherwise able children from reaping the maximum benefit from 
interventions targeting social skill development. The K01 career 
development aims will allow me to gain additional instruction, mentoring 
and experience in: (1) assessment and treatment of ASD and childhood 
anxiety; (2) the design of psychosocial treatment manuals; (3) methods and 
statistical techniques appropriate for the design and conduct of randomized 
controlled trials of psychosocial interventions; and (4) responsible and 
ethical conduct of research. These training objectives relate directly to 
my research plan, the ultimate goal of which is to develop an 
evidence-based, efficacious treatment program for children with ASD that 
targets social skill development and anxiety reduction. The aims of this 
research are: (1) to develop an alpha version of a treatment manual that 
addresses social skill development and co-occurring anxiety in
school-age children and adolescents with ASD; (2) to pilot strategies
comprising the treatment manual with a small group (n=5) of children to 
refine intervention strategies and delivery; (3) to collect preliminary 
data on the short-term efficacy, as well as feasibility, of this 
structured manual-based treatment in a sample (n=24) of children with ASD
complicated by anxiety; and (4) to develop a grant application to conduct a 
larger scale efficacy study of the treatment curriculum. Through this 
training and research plan, I will be well-positioned to carry out
independent investigations designed to translate an empirical understanding 
of anxiety and social disability in children with high-functioning ASD into 
novel treatment approaches. 



DESCRIPTORS: child psychology; adolescence (12-20); middle childhood 
(6-11); clinical trial; human subject; anxiety; autism; social behavior 
disorder; social behavior; handbook; human therapy evaluation; cognitive 
behavior therapy; clinical research; behavioral/social science research
tag; Asperger syndrome; therapy design/development

Toward a Rat Model of Alcohol Abuse in Schizophrenia

PRINCIPAL INVESTIGATOR: CHAU, DAVID

ADDRESS: david.t.chau@dartmouth.edu David Thanh Chau Dartmouth Medical 
School Lebanon, NH 03756 

PERFORMING ORG.: DARTMOUTH COLLEGE, HANOVER, NEW HAMPSHIRE 
SPONSORING ORG.: NATIONAL INSTITUTE OF MENTAL HEALTH 

DATES: 2008/15/07 TO 2004/30/09 FY: 2007 TYPE OF AWARD: New Award
(Type 1) 

SUMMARY: DESCRIPTION (provided by applicant): Alcohol use disorder 
commonly occurs among patients with schizophrenia and contributes greatly 
to the morbidity of schizophrenia. Patients with schizophrenia tend to 
consume modest quantities of alcohol on a regular basis and are less likely 
to develop alcohol dependence than alcohol abuse, but even this modest use 
of alcohol dramatically worsens their symptoms and decreases their overall 
functioning. Green (co-investigator) and colleagues have suggested that
such moderate alcohol use may, however, transiently ameliorate a brain
reward circuit deficiency that underlies alcohol use disorder in these
patients. Unfortunately, available treatments for co-occurring alcohol
use disorder in schizophrenia are very limited. This revised R03 proposal
seeks to begin a line of research toward the development of such an animal 
model of alcohol use disorder in schizophrenia, an animal that exhibits 
characteristics of schizophrenia, and like patients with schizophrenia, 
also drinks at least moderate amounts of alcohol. To develop this animal 
model, we propose to use, as a base, a rat with a neonatal ventral 
hippocampal lesion (the NVHL rat), a well-established animal model of 
schizophrenia, a rodent that as an adult exhibits requisite characteristics 
of schizophrenia, demonstrates abnormalities in its brain reward circuit, 
and, interestingly, has recently been shown to exhibit increased cocaine 
self-administration. Our preliminary data in a small group of adult 
NVHL rats also suggest that this rat will voluntarily drink at least 
moderate amounts of alcohol. This revised research proposal seeks to 
further probe the potential role of the NVHL rat as an animal model of 
schizophrenia and comorbid alcohol use disorder. Using free-access
conditions, we will: (1) compare the amount and preference of alcohol
drinking [and blood alcohol level] in NVHL rats versus sham-operated rats; 
and (2) compare the size, frequency and temporal distribution of alcohol 
drinking bouts in NVHL rats versus sham-operated rats. If NVHL rats can be
differentiated from sham rats according to these measures, we plan to 
continue research with NVHL rats in subsequent studies to: (1) explore 
mechanisms mediating alcohol drinking in these animals (e.g., to address 
the question of whether alcohol use serves to transiently ameliorate a 
deficit in brain reward functioning); and (2) screen medications that might 
be able to decrease alcohol drinking in this rat. Ultimately, we expect to
translate the findings from our studies with the NVHL rat into studies
involving human subjects, with the long-term goal of this research to find
novel medications to treat patients with schizophrenia and alcohol use 
disorder, and thus to improve the outcome of these patients. Alcohol use 
disorder occurs commonly among patients with schizophrenia and greatly 
worsens the overall functioning of these patients. This research seeks to 
develop an animal model of alcohol use disorder in schizophrenia, an animal 
model that exhibits schizophrenia-like characteristics as well as increased 
alcohol drinking. This animal model, when developed, will be used: (1) to 
elucidate the underlying basis of alcohol use disorder in patients with 
schizophrenia; and, (2) to develop novel medications to limit alcohol use 
in these patients. 

DESCRIPTORS: alcoholism/alcohol abuse; gamma aminobutyrate; laboratory
rat; hippocampus; experimental brain lesion; self medication;
disease/disorder model; dopamine; behavior test; preference; reinforcer;
antipsychotic agent; schizophrenia; comorbidity; behavioral/social science
research tag; substance abuse related behavior

Development of a Treatment Adherence Program for Bipolar Substance
Abusers 

PRINCIPAL INVESTIGATOR: MILLER, IVAN W. 

ADDRESS: Ivan Miller@brown.edu Butler Hospital Providence, RI 02906
PERFORMING ORG.: BUTLER HOSPITAL (PROVIDENCE, RI), PROVIDENCE, RHODE
ISLAND 

SPONSORING ORG.: NATIONAL INSTITUTE ON DRUG ABUSE 

DATES: 2009/30/07 TO 2007/31/10 FY: 2007 TYPE OF AWARD: New Award
(Type 1) 

SUMMARY: DESCRIPTION (provided by applicant): There is a substantial
co-occurrence between substance use disorders and bipolar disorder. Bipolar
disorder is five to eight times more likely to occur in patients with 
substance use disorders than in the general population. Conversely, rates 
of substance use disorders in bipolar samples have been reported to be as 
high as 60% or more. This comorbidity is associated with an earlier age of 
illness onset, increased symptom severity, greater tendency for violence, 
higher rates of psychiatric hospitalization, slower time to remission of 
acute mood episodes, poorer response to lithium treatment, and increased 
suicidality and mortality rates. Although treatment nonadherence is a 
significant problem in both substance use and bipolar disorders 
independently, the co-occurrence of these conditions is related to even
poorer compliance rates. Further, research indicates that bipolar substance 
abusers have a worse course of illness compared to noncomorbid patients, 
and nonadherence is the most consistent predictor of these poor outcomes. 
To date, there is very little research on behavioral interventions 
specifically designed to improve treatment adherence in this high-risk, 
comorbid population. The present proposal is designed to meet the 
objectives of Stage I of the NIDA Behavioral and Integrative Therapies
Development Program (PA-07-111). Stage I is the initial stage of treatment
development research: to formulate new behavioral therapies; to 
operationally define therapy manuals and procedures; and to pilot test and 
refine new therapies. The overall aim of this study is to develop the 
Integrated Treatment Adherence Program (ITAP), which is designed as an
adjunctive intervention for improving treatment adherence (broadly defined) 
in bipolar substance abusers, and to collect preliminary data on the 
feasibility, acceptability, and initial efficacy of the program. More 
specifically, we propose the following major aims: 1. To develop a 
comprehensive treatment manual for ITAP - an innovative, multi-modal 
intervention that combines motivational, family, and telephone-based 
strategies - by conducting a small open trial (n = 15) with patients with 
drug dependence and bipolar disorder. 2. To conduct a randomized controlled 
pilot study in a sample (n = 60) of patients initially hospitalized with 
comorbid substance dependence and bipolar disorder by comparing ITAP to 
treatment as usual to estimate relevant treatment parameters (e.g., 
acceptability, preliminary efficacy), including HIV risk behaviors. This 
pilot study will lay the groundwork for a larger clinical trial (Stage II) 
evaluating the efficacy of this new treatment for improving adherence in 
bipolar substance abusers.

HERBIVORE-MEDIATED INDIRECT EFFECTS OF AN EXOTIC THISTLE ON NATIVE
THISTLES. 

ASSOCIATE INVESTIGATORS: Louda, S. M. 

PERFORMING ORG.: UNIVERSITY OF NEBRASKA, SCHOOL OF BIOLOGICAL SCIENCES, 
LINCOLN, NEBRASKA 68583 

TYPE OF AWARD: NRI COMPETITIVE GRANT



SUMMARY: The overall objective of this project is to evaluate the
interaction between an invasive exotic weed and two related native plant 
species mediated by a shared insect herbivore, a deliberately released 
biocontrol weevil, Rhinocyllus conicus, in prairie rangelands. The weevil 
is here and having significant impacts on prairie species. Can this impact 
be managed and, if so, how? The preliminary data suggested that R. 
conicus reduced its use and impact on native species in the vicinity of its 
preferred, exotic host plant, musk thistle (Carduus nutans spp.). Thus, the
objectives of the research are to: 1) evaluate the generality of this 
observation in Nebraska grasslands, 2) develop a better mechanistic,
experimentally-based understanding of
insect-mediated indirect interactions between plant species within this 
system, and 3) examine the applicability of that new data to managing the 
impact of R. conicus on sparse or rare native plant species. The study is 
designed to answer three fundamental questions: 1) how are seed losses of 
native thistle species related to ecological circumstances, such as 
proximity to stands of the targeted weed, musk thistle, and surrounding 
vegetation; 2) are plant co-occurrence and observed levels of impact
causally related; and, 3) can the ecological factors be manipulated to
minimize negative impacts on rare native species. The aim is to improve
our basic understanding of herbivore-mediated indirect interactions
between co-occurring plants
and apply that understanding to science-based management of non-target 
effects associated with the biological control of invasive plant species, 
such as thistles. The study entails both data collection on the pattern 
of injury inflicted by Rhinocyllus conicus on native thistles in prairie 
grasslands and the response of R. conicus to native species in 
experimentally planted arrays. The hypotheses to be tested are that
co-occurrence of the native species with musk thistle: (H1) has no effect on
seed loss of the native, or (H2) decreases seed loss by the native
(="associational defense"), or (H3) increases seed loss by
the native (="associational susceptibility"). The patterns will be
documented in relation to proximity to musk thistle (Carduus nutans ssp.
leiophyllus) stands as well as variation in weevil densities and identity
of the ambient plant community. The experiments will determine directly the 
degree to which native plant use is influenced by proximity to stands of 
the targeted, preferred weed species. PR factors, specifically local density 
and proximity of the exotic weed musk thistle (Carduus nutans), affect 
non-target damage to native species such as wavyleaf thistle (Cirsium 
undulatum), by Rhinocyllus conicus, a European weevil used as a
biocontrol agent for musk thistle. Theory
predicts that co-occurrence with musk thistle could increase or
decrease R. conicus damage to native thistles. Damage may be less where
wavyleaf and musk thistles co-occur (associational defense), if weevils are
drawn from the native by their preferred host, musk thistle. Alternately,
co-occurrence may increase damage to native thistles (associational
susceptibility), if weevils on musk thistle spill over onto natives. Our
results should help reduce R. conicus non-target effects by providing data 
needed to evaluate two management strategies: 1) reduce musk thistle 
abundance to minimize weevil spillover onto natives vs. 2) use musk
thistles as a "trap crop" to draw weevils from natives. We quantified
R. conicus use of the wavyleaf thistle in three
ways: regional surveys (2001-2003), experimental manipulations of R.
conicus (2002-2003), and quantification of R. conicus oviposition in
relation to wavyleaf density on loess soils (2004) in southwest Nebraska.
Earlier data from the Sand Hills, where musk thistle is very rare, showed 
more R. conicus damage to wavyleaf thistles in dense patches (>5 stems in a 
3 m radius) than to isolated plants (>20 m from any bolting thistle). Our 
results pr 

PROGRESS REPORT SUMMARY:
Russell, F. L. and S. M. Louda. 2004. Phenological synchrony affects
interaction strength of an exotic weevil with Platte thistle, a native host
plant. Oecologia 139: 525-534.
Rand, T. A., F. L. Russell and S. M. Louda. 2004. Local vs. landscape scale
indirect effects of an invasive weed on native plants. Weed Technology 18:
1250-1254.
Louda, S. M., T. A. Rand, F. L. Russell and A. E. Arnett. 2005. Assessment
of Ecological Risks in Biocontrol: Input from Retrospective Ecological
Analyses. Biological Control: in press.
Russell, F. L. and S. M. Louda. 2005. Insect abundance, phenology and
associational defense influence floral herbivory by an invasive insect.
Oecologia. In review.
Russell, F. L., S. M. Louda and T. A. Rand. 2005. Variation in
herbivore-mediated indirect effects of an invasive plant on a native plant.
In draft.
Russell, F. L. and S. M. Louda. 2006. Spatial variation in Rhinocyllus
conicus response to density of an adopted native host plant, Cirsium
undulatum. In prep.
Russell, F. L. and S. M. Louda. 2006. Does weed density explain variation
in Rhinocyllus conicus damage to a target host plant, Carduus nutans (musk
thistle)? In prep.

DESCRIPTORS: herbivores; insects; environmental impact; defense 
mechanisms; invasive species; plant competition; weed control; non target 
organisms; biological control (weeds); prairies; rangelands; rhinocyllus 
conicus; cirsium; carduus nutans; risk management; plant ecology; insect 
ecology; exotic plants; native plants; ecosystem management; plant damage; 
spatial distribution

Atypical Antipsychotic Use and Smoking Cessation in Those with Bipolar
Disorder and Schizophrenia 

PRINCIPAL INVESTIGATOR: Matthews, Annette M, M.D. 

PERFORMING ORG.: Department of Veterans Affairs, Medical Center,
Portland, OR

SPONSORING ORG.: Department of Veterans Affairs, Research and Development
(15), 810 Vermont Ave. N.W., Washington, D.C. 20420 United States of
America

DATES: 20070101 

SUMMARY: SMOKING; ANTIPSYCHOTIC AGENTS; BIPOLAR DISORDER; SCHIZOPHRENIA 
OBJECTIVES: To determine if atypical antipsychotic use is associated 
with a decreased rate of smoking cessation in those with bipolar disorder 
as compared to those with schizophrenia. 

PLAN: We will collect data on patients treated for bipolar disorder
in the Veterans Administration VISN 20 between January 2000 and December
2005. VISN 20 includes 8 medical centers and 17 community-based
outpatient clinics (CBOCs) distributed throughout Alaska, Washington,
Oregon, and Idaho. We will acquire the data by downloading it from the VISN 20
data warehouse into a local database, using structured query language (SQL)
queries to organize it, and export it to SPSS 14.0 for analysis. We will
compare those with schizophrenia on atypical antipsychotics with those with 
bipolar disorder on atypical antipsychotics who smoke. 

METHODS: We will use Cox proportional hazards models to compare
the two groups. We will examine time from baseline, January 2000, until
smoking cessation, where smoking cessation is measured by the subject's answer
to an annual required smoking cessation alert. We will control for participant
characteristics such as age, medications, and co-occurring conditions.
Diagnosis (bipolar disorder versus schizophrenia) will be included as a
predictor. Medications will be entered into the models as time-varying
covariates. From a preliminary data analysis we know that we will be
able to look at data for at least 300 people in each group of interest.
Portland VA Medical Center Institutional Review Board approval for data
collection and analysis will be obtained before the data is collected.

FINDINGS TO DATE: none. 

*** PDS Report: Initial; Report Date: 01/01/07; Submitted: 06/21/07 ***
Initial Report

DESCRIPTORS: SMOKING; ANTIPSYCHOTIC AGENTS; BIPOLAR DISORDER;
SCHIZOPHRENIA

Archival Data Analysis of Traumatic Brain Injury and Co-existing
Psychiatric Illness in Veterans

PRINCIPAL INVESTIGATOR: Yesavage, Jerome A., M.D. 

PERFORMING ORG.: Department of Veterans Affairs, Medical Center, Palo
Alto, CA

SPONSORING ORG.: Department of Veterans Affairs, Research and Development
(15), 810 Vermont Ave. N.W., Washington, D.C. 20420 United States of
America

DATES: 20071114 

SUMMARY: MENTAL DISORDERS; DATABASE; HEAD INJURY; VETERANS
The purpose of this archival data analysis and chart review study is to
examine the relationship between traumatic brain injury (TBI) and
co-existing psychiatric illness in veterans in the VA Local VistA database
and/or the National Patient Care Database (NPCD). The goal is to
characterize the co-existing psychiatric illness in this population, to
improve the planning and development of targeted programs for early
detection and intervention of co-existing psychiatric illness in this
patient population. The aims of the proposed data analysis study are to:
(1) evaluate the sociodemographic characteristics and VA health service
utilization of veterans who have sustained a documented TBI as reflected by
VA medical records; (2) determine the prevalence rate of co-occurring
psychiatric diagnoses in veterans with TBI increase as a function by year.
RESEARCH PLAN AND METHODS 

The proposed study will be a prospective cohort design and an archival
data analysis. In the proposed study, we will analyze data from the VA
Local VistA database and/or the National Patient Care Database (NPCD).
Participants between the ages of 18 and 89 years, of any race or ethnicity,
and who meet our inclusion and exclusion criteria, will be included in the
study. In addition, we will include demographically matched veteran control
participants. Primary data to be collected are brain injury-related 
and psychiatric diagnostic categories. Other primary data will include 
TBI clinical reminder questions. Descriptive statistics will be used to 
analyze sociodemographic variables, including gender, ethnicity, age, 
marital status, armed forces component, and health service utilization. 
Additional statistical analyses will be used as needed to compute secondary 
analyses (i.e., ANOVA, multiple regression). 

CLINICAL RELEVANCE 

Traumatic brain injury (TBI) is reported to be the most common
consequence of combat-related injuries among surviving U.S. soldiers in the 
Operation Enduring Freedom (OEF) and Operation Iraqi Freedom (OIF) 
conflicts. The increasing number of TBI trauma survivors is a high priority, 
high cost area for the VA. Research also suggests that TBI may cause
decades-long or even permanent vulnerability to psychiatric illness in some
individuals. The findings are critical to development of targeted programs 
for early identification and intervention for veterans who may have TBI 
and/or co-existing psychiatric illness. As the veteran population ages, 
disability as a consequence of TBI will become a significant health care 
issue in the coming decades in the treatment of older veterans (e.g., 
Vietnam, Korea, WW I era) in the VA. Given that the majority of TBI have 
been blast-related, the generalizability beyond the VA is questionable.

*** PDS Report: Initial; Report Date: 11/14/07; Submitted: 11/19/07 ***
Initial Report

DESCRIPTORS: MENTAL DISORDERS; DATABASE; HEAD INJURY; VETERANS

Unified Psychogeriatric Biopsychosocial Evaluation and Treatment (UPBEAT)

PRINCIPAL INVESTIGATOR: Blow, Frederic C., Ph.D.

PERFORMING ORG.: Department of Veterans Affairs, Medical Center, Loma
Linda, CA

SPONSORING ORG.: Department of Veterans Affairs, Research and Development
(15), 810 Vermont Ave. N.W., Washington, D.C. 20420 United States of
America

DATES: 19950215 

SUMMARY: GERIATRIC PSYCHIATRY; EVALUATION STUDIES; DEPRESSION; ANXIETY; 
ALCOHOLISM 

OBJECTIVES: The UPBEAT (Unified Psychogeriatric Biopsychosocial
Evaluation and Treatment) program was established in 1994 to improve outcomes
for older veterans hospitalized for medical conditions and who also had
comorbid depression, anxiety, or substance abuse. The program used an
intervention focused on treating and managing the veteran's concurrent
mental disorder.

The UPBEAT program was initiated as a result of special Congressional
legislation mandating the VA program, and was specifically designed to end
in 2001. As a result, the Coordinating Center ceased to exist at the end of
FY01. At the request of VA Headquarters, SMITREC has agreed to maintain
and utilize the UPBEAT data set. Additional IRB applications will be
filed as required when specific plans for analyses and manuscript
preparations are made.

RESEARCH DESIGN: Subjects were enrolled into the study at 9 VA Medical
Centers nationwide. Patients were screened for depression, anxiety
and substance abuse; those screening positive were randomized to
intervention or control conditions. Follow-up assessments were conducted at 6-, 12-,
and 24-month intervals following baseline.

METHODOLOGY: Data were coded and computerized at the West LA/UCLA
Coordinating Center. All data cleaning and coordination has been the
responsibility of the Coordinating Center. In addition, utilization data
were extracted from the VA Administrative Data Sets at Austin and were
merged with patient level data. Initial data analyses have been
conducted, and several manuscripts have been published from the data set.

CLINICAL RELATIONSHIPS: Co-occurring physical and mental health
disorders are a critical issue to ensure best practices in the treatment
of older veterans in the VHA system. This database will provide VA
headquarters, policy makers, and clinicians with needed techniques and
information regarding the most effective means to deal with a serious issue
in the care of this vulnerable older adult population. Per VA headquarters'
request, SMITREC continues to maintain this database.

10/22/03 FINDINGS: UPBEAT intervention appears to accelerate the
transition from inpatient to outpatient care (shorter lengths of stay) for
patients admitted to acute medical or surgical hospital services who
have undiagnosed psychiatric symptoms. The kind of case management or care
coordination that appeared to be most successful in UPBEAT is similar to
that done for other high cost patients by hospital-based home care managers
and mental health intensive case management teams.

10/27/04 FINDINGS: UPBEAT intervention appears to accelerate the
transition from inpatient to outpatient care (shorter lengths of stay) for
patients admitted to acute medical or surgical hospital services who
have undiagnosed psychiatric symptoms. The kind of case management or care
coordination that appeared to be most successful in UPBEAT is similar to
that done for other high cost patients by hospital-based home care managers
and mental health intensive case management teams.

Two manuscripts have been submitted for publication and accepted, but
have not yet been published: 1) Jarvik L, Gerson S, Maxwell A, Blow FC, et
al. Symptoms of depression and anxiety (MHI) following acute
medical/surgical hospitalization and post-discharge psychiatric (DSM) in 839
geriatric US veterans. Int J of Geriatric Psychiatry (in press); 2) Oslin
DW, Thompson R, Kalian MJ, TenHave T, Blow FC, Bastani R, Gould RL, Maxwell
AE, Jarvik L. Treatment effects from UPBEAT: A randomized trial of care
management for behavioral health problems in hospitalized elderly patients.
J Geriatr Psychiatr Neurol (in press).

DESCRIPTORS: GERIATRIC PSYCHIATRY; EVALUATION STUDIES; DEPRESSION; 
ANXIETY; ALCOHOLISM

Anxiety Symptoms in Schizophrenia: Association with Function, Self-Esteem
and Symptoms

PRINCIPAL INVESTIGATOR: Lysaker, Paul, Ph.D. 

PERFORMING ORG.: Department of Veterans Affairs, Medical Center, 
Indianapolis, IN 

SPONSORING ORG.: Department of Veterans Affairs, Research and Development 
(15), 810 Vermont Ave. N.W., Washington, D.C. 20420 United States of
America 

DATES: 20040319 

SUMMARY: SELF CONCEPT; SCHIZOPHRENIA; ASSOCIATION; ANXIETY 
OBJECTIVES: The proposed study seeks to gather data on the
co-occurrence of anxiety symptoms in schizophrenia with the primary long
term goal of gathering pilot data to support a grant proposal for the
development of a federally funded cognitive behavior therapy intervention
that might uniquely target anxiety symptoms in this group.

Initial Objectives:

1 To determine the frequency with which persons with schizophrenia 
experience significant levels of co-occurring anxiety symptoms
including obsessions and compulsions, social anxiety and symptoms linked 
with trauma. 

2 To determine whether co-occurring symptoms of anxiety are linked
with deficits in working memory and executive function. 

3 To determine whether co-occurring symptoms of anxiety are linked
with lower levels of self-esteem, social function, and more avoidant
coping as well as experiences of stigma. 

4 To compare the severity of anxiety symptoms of persons with 
schizophrenia with those suffering from Post Traumatic Stress Disorder. 

5 To determine the stability of anxiety symptoms and their correlates 
over a period of six months. 

Long-term Objectives: 

6 To gather pilot data on the need and possible targets for a Cognitive 
Behavior Therapy (CBT) intervention designed to reduce anxiety symptoms in 
schizophrenia. 

7 To determine effect sizes necessary for a power analysis to determine
sample sizes for a study of the effects of CBT on anxiety symptoms in
schizophrenia.

RESEARCH PLAN: Recruitment of 90 participants will begin upon
approval. By enrolling a minimum of 35 participants per month we anticipate
completing all initial assessments within four months. We expect to
complete all follow-up assessments and all analyses within one year of
receiving full approval for the protocol. We intend to apply for funding
within one year of receiving full approval. We anticipate having completed
and submitted all manuscripts from this study within 18 months of
receiving study approval.

METHODOLOGY: Ninety participants will be
recruited from the psychiatry service of a Community Mental Health Center
and VA Medical Center. To qualify for the study, participants must have a
SCID confirmed diagnosis of schizophrenia or schizoaffective disorder and
will be in a post acute stage of illness as defined by no
hospitalizations, changes in type of psychotropic medication or in housing
within the previous 30 days. Participants will be a minimum of 18 years of
age. Exclusion criteria will include active substance use, or history of
mental retardation diagnosis. For comparison purposes an additional 25
persons with Posttraumatic Stress Disorder will be recruited from the PTSD
program of the VA Medical Center. Exclusion criteria for these
participants will include a history of mental retardation or a diagnosis of
a psychotic disorder or active substance abuse. To be eligible all PTSD
participants must have had no hospitalizations, changes in type of
psychotropic medication or in housing within the previous 30 days.
Following informed consent and confirmation of eligibility, participants
will undergo an initial assessment battery including measures of
neurocognition, symptoms and function. Participants will be invited to return in
6 months for a reassessment, including all initial assessment procedures
except neurocognitive assessment. In addition, at reassessment participants
will be asked only about trauma experiences over the preceding six months.

Primary data analysis will include factor and cluster analyses to
determine grouping of anxiety symptoms in the schizophrenia group. MANOVA
and ANOVA procedures will be used to compare PTSD and schizophrenia
participants on key measures. Multiple regression procedures will be utilized to
examine links between level of anxiety symptoms and positive and negative
symptoms, awareness of illness, neurocognition and psychosocial function.

RESULTS: Most recently, with the use of federal funding

DESCRIPTORS: SELF CONCEPT; SCHIZOPHRENIA; ASSOCIATION; ANXIETY

Maintenance of the UPBEAT Data Set

PRINCIPAL INVESTIGATOR: Blow, Frederic C., Ph.D.

PERFORMING ORG.: Department of Veterans Affairs, Medical Center, Ann
Arbor, MI

SPONSORING ORG.: Department of Veterans Affairs, Research and Development
(15), 810 Vermont Ave. N.W., Washington, D.C. 20420 United States of
America

DATES: 20010809

SUMMARY: SUBSTANCE-RELATED DISORDERS; MENTAL DISORDERS; ADULT
OBJECTIVES: The UPBEAT (Unified Psychogeriatric Biopsychosocial
Evaluation and Treatment) program was established in 1994 to improve 
outcomes for older veterans hospitalized for medical conditions and who 
also had comorbid depression, anxiety, or substance abuse. The program used 
an intervention focused on treating and managing the veteran's concurrent 
mental disorder. RESEARCH PLAN AND METHODOLOGY: Subjects were enrolled into 
the study at 9 VA Medical Centers nationwide. Patients were screened for 
depression, anxiety and substance abuse; those screening positive were 
randomized to intervention or control conditions. Follow-up assessments 
were conducted at 6-, 12-, and 24-month intervals following baseline. 1377 
subjects were randomized to the UPBEAT intervention and 1339 were included 
in the control group. Prior to enrollment, individual subjects signed 
informed consent. Data were coded and computerized at the West LA/UCLA
Coordinating Center. All data cleaning and coordination has been the
responsibility of the Coordinating Center. In addition, utilization data
were extracted from the VA Administrative Data Sets at Austin and were 
merged with patient level data. Initial data analyses have been 
conducted, and several manuscripts have been published from the data set 
. The UPBEAT program was initiated as a result of special congressional 
legislation mandating the VA program, and was specifically designed to end 
in 2001. As a result, the Coordinating Center ceased to exist at the end of 
FYO I. At the request of VA Headquarters; SMITREC has agreed to maintain 
and utilize the upbeat data set and will continue to do so 
indefinitely. Additional IRB applications will be filed as required when 
specific plans for analyses and manuscript preparations are made. CLINICAL 
RELEVANCE: Co-occurring physical and mental health disorders are a 
critical issue to ensure best practices in the treatment of older veterans 
in the VHA system. This database provides headquarters, policy makers, and 
clinicians with needed techniques and information regarding treatment 
interventions to deal with a serious issue in the care of this vulnerable 
older adult population. 08-12-2003 8/14/03 Is 

Co-occurring physical and mental health disorders are a critical 
issue to ensure best practices in the treatment of older veterans in the 
VHA system. This database will provide VA headquarters, policy makers, and 
clinicians with needed techniques and information regarding the most 
effective means to deal with a serious issue in the care of this vulnerable 
older adult population. Per VA headquarters' request, SMITREC continues to 
maintain this database. 

Update 7/7/05: SMITREC continues to maintain the database at request of 
VA Central Office. There is no progress to report, 
al 

*** PDS Report: Progress; Report Date: 08/09/06; Submitted: 08/21/06 
***There is no progress to report at this time although future analyses 
are planned. 

al 

*** PDS Report: Final; Report Date: 07/18/07; Submitted: 07/18/07 *** 
FINAL REPORT 7/16/2007: No additional analyses have occurred in the past 
year, nor have there been any new publications from this data. The project 
is being terminated. 

DESCRIPTORS: SUBSTANCE-RELATED DISORDERS; MENTAL DISORDERS; ADULT 
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PUBLICATION DATE: 1981 

DOCUMENT TYPE: Journal Article 
RECORD TYPE: Abstract 
LANGUAGE: English 
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ABSTRACT: 

A study was carried out of the relationship between the vocabulary of 
user queries and the vocabulary of documents relevant to the queries, and 
the value of adding to the document description record in a retrieval 
system keywords from previous queries for which the document had proved 
useful. Two test databases incorporating user query keywords were 
implemented at the School of Library and Information Science, University 
of Western Ontario. Clustering of the documents via title and user 
keywords, a statistical analysis of title-user keyword co-occurrences, 
and retrieval tests were used to examine the effect of the added keywords. 
Results showed the impracticality of the procedure in an operational 
setting, but indicated the value of analyses with sample data in the 
development and maintenance of keyword dictionaries and thesauri. 



Descriptors: Bibliographic retrieval; Data base; Clustering; Keyword 
information 
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Purpose. This paper describes the design, development, and partial 
implementation of the TUIT (Technology Utilized for Investigation and 
Teaching) system, a computer-based system for research and teaching in the 
field of literature. 

The TUIT system at the present time consists of a basic calling 
and parameter-analyzing main program which links together four separate 
subprograms. The subprograms are designed to: modify and/or create files in 
a standardized format; index the file, either through KWIC or KWOC 
procedures, and provide an indication of the frequency of appearance of 
words in the text; count the number of words, sentences, and 
syllables in the text; and calculate several different readability indices 
for the textual material. Several additional programs and changes to 
existing programs are projected for the future development of the system. 
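
Note: The KWIC (keyword-in-context) indexing step mentioned above can be sketched in Python as follows. This is a minimal illustration only, not the TUIT subprogram itself; the function name `kwic`, the `width` parameter, and the small stopword list are all assumptions introduced here.

```python
def kwic(text, width=3, stopwords=frozenset({"the", "a", "of", "and", "in", "to"})):
    """Return KWIC entries as (keyword, left context, word as written, right context).

    Hypothetical sketch: every non-stopword becomes an index keyword, shown
    with up to `width` words of context on each side, sorted by keyword.
    """
    words = text.split()
    entries = []
    for i, word in enumerate(words):
        key = word.strip(".,;:!?").lower()
        if not key or key in stopwords:
            continue
        left = " ".join(words[max(0, i - width):i])
        right = " ".join(words[i + 1:i + 1 + width])
        entries.append((key, left, word, right))
    # Alphabetical order by keyword, as in a printed KWIC index.
    entries.sort(key=lambda entry: entry[0])
    return entries


if __name__ == "__main__":
    for key, left, word, right in kwic("The cat sat on the mat", width=1):
        print(f"{left:>10} [{word}] {right}")
```

A KWOC (keyword-out-of-context) listing would differ only in presentation: the keyword is printed in a separate column beside the full line rather than in place.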

TUIT, and other computer-based text handling systems, can be of 
value not only to the researcher, but also to the instructor, through the 
ability to provide consistent and comparable data about a single text or 
group of texts. The advantages to be gained--in terms of ease of handling 
the material, consistency of computations, and speed and accuracy in 
manipulating the text--make the required initial effort to convert the 
material to machine-readable form well worthwhile. 

Appendices describing sample output from the system and providing 
both technical and non-technical guides to the use of the system are also 
included. 
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Country of Publication: United States 

A computerized program for assessing the readability of technical 
documentation is presented. This program is particularly useful to Army 
personnel responsible for the readability of Army publications. The program 
is designed to provide the user with an analysis of the text that includes: 
(a) the complete text, (b) a listing of words containing 3 or more 
syllables and the number of times each multi-syllable word appears in 
the text, (c) the number of sentences, (d) the average sentence length, 
(e) the number of words, (f) the number of syllables, (g) the average 
syllables per word and (h) the Flesch-Kincaid reading grade level score. An 
appendix provides the reader with both a complete program listing (BASIC) 
and sample input and output files. 
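
Note: The statistics listed in this abstract can be sketched in Python as follows. This is not the BASIC program from the report. The vowel-group syllable counter is a rough heuristic assumed here for illustration, and the function names are hypothetical. The grade formula is the standard Flesch-Kincaid one: 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59.

```python
import re
from collections import Counter


def count_syllables(word):
    """Approximate syllables as runs of vowels (an assumed heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability_stats(text):
    """Compute word, sentence, and syllable counts and the Flesch-Kincaid grade."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    # Words of 3 or more syllables, with the number of times each appears.
    polysyllabic = Counter(w.lower() for w in words if count_syllables(w) >= 3)
    grade = 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59
    return {
        "sentences": len(sentences),
        "words": len(words),
        "syllables": syllables,
        "avg_sentence_length": len(words) / len(sentences),
        "avg_syllables_per_word": syllables / len(words),
        "polysyllabic": polysyllabic,
        "fk_grade": grade,
    }


if __name__ == "__main__":
    print(readability_stats("The cat sat. The dog ran."))
```

Very simple text can yield a negative grade, as in the sample above; the formula is not bounded below.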

Descriptors: *Reading; *Technical writing; *Human factors engineering; 
Documents; Military publications; Computer programs; Comprehension; Writing; 
Literacy; Automatic; Scoring; Army personnel; Word lists; Output; Input; 
Files(Records); Quantity; Syllables; Length 

Identifiers: Readability; Flesch reading ease formula; NTISDODXA 
Section Headings: 88GE (Library and Information Sciences--General) 
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Searching far and wide: the powerful document retrieval software of PLS. 
(Personal Library Software) (Company Business and Marketing) 
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ISSN: 0889-9762 LANGUAGE: English RECORD TYPE: Fulltext 

WORD COUNT: 4746 LINE COUNT: 00379 

of a relevance score for each document are variables such as: 

* The number of times each query term is found in a document; 

* The number of different search terms that appear in a document; 

* How close the 'hit' terms found are to the beginning of each 
document; 

* How closely together different search terms appear in the text; 

* How closely the order in... 

...the terms matching those in the query, as indicated by frequency of 
appearance in the document collection. (Rarer terms are more useful in 
indicating what a document is about, and receive a... 
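
Note: A few of the variables listed in this excerpt (term frequency, number of distinct query terms matched, closeness of the first hit to the start of the document, and term rarity) can be combined in a toy scoring function such as the Python sketch below. The excerpt does not give the PLS formula, so the weighting and the way the factors are combined are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def score_documents(query, docs):
    """Return one illustrative relevance score per document (assumed formula)."""
    doc_tokens = [tokenize(d) for d in docs]
    query_terms = set(tokenize(query))
    if not query_terms:
        return [0.0] * len(docs)
    # Document frequency: in how many documents each query term occurs.
    df = {t: sum(1 for toks in doc_tokens if t in toks) for t in query_terms}
    scores = []
    for toks in doc_tokens:
        counts = Counter(toks)
        matched = [t for t in query_terms if counts[t] > 0]
        score = 0.0
        for term in matched:
            rarity = math.log(1 + len(docs) / df[term])  # rarer terms weigh more
            earliness = 1.0 / (1 + toks.index(term))     # earlier hits weigh more
            score += counts[term] * rarity + earliness
        # Reward documents that match more of the distinct query terms.
        score *= len(matched) / len(query_terms)
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = ["apple banana apple", "banana cherry", "cherry date"]
    print(score_documents("apple banana", docs))
```

Proximity and term-order factors, also named in the excerpt, are left out to keep the sketch short.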



13/3, K/2 (Item 2 from file: 275) 

DIALOG (R) Ft 1 e 275: Gale Group Computer DB(TM) 
(c) 2008 The Gale Group. All rts. reserv. 

01832377 SUPPLIER NUMBER: 17378357 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

What to do with documents. (Speed Reading: Optical Character Recognition 
Software) (document-management software) (Software Review) (Evaluation) 

Computer Shopper, v15, n10, p529(1) 
Oct, 1995 
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more power than document managers for finding information in word 
processing, spreadsheet, and database files. 

Text-retrieval programs are easier to set up and use because they 
handle fewer types of information and don't interact with... 

...can start scanning and OCR from within document-management programs, you 
handle those tasks before starting a text-retrieval package, saving 
pages as text files. 

The text-retrieval software then indexes those files, noting all the 
words it finds and the documents in which they appear. If you then 
search for, say, documents containing... 
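
Note: The indexing step described in this excerpt, recording every word together with the documents in which it appears, is what is usually called an inverted index. A minimal Python sketch follows; the file names and function names are hypothetical, and the products reviewed in the article are not claimed to work this way internally.

```python
import re
from collections import defaultdict


def build_index(files):
    """Map each word to the set of file names whose text contains it."""
    index = defaultdict(set)
    for name, text in files.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the files that contain every word in the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    files = {
        "a.txt": "Budget report for March",
        "b.txt": "March sales report",
    }
    index = build_index(files)
    print(sorted(search(index, "march report")))
```

Because the index is built once, each later search only intersects small sets instead of rescanning every file.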
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the letters appeared in reverse type. 
Setting up a search. Above: This is how to set up a search for all 
articles containing the keyword "aids." Right: This page was found to 
contain the keyword "aids." This time, Newsware found the keyword in the 
headline, but... 



...This color image is linked to a monochrome image created at the time of 
the initial page scan. In the initial implementation of Newsware, which 
is currently being delivered, color images are... 
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Grammar checkers apply polish to writing's rough edges. (Software Review) 
(Rightsoft Inc's Rightwriter, Lifetree Software Inc's Correct Grammar and 
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related article on Rightwriter's lack of an interactive user interface) 
(evaluation) 

Enyart, Bob; Erickson, Michelle; Webster, Steven; Frentzen, Jeffrey 
PC Week, v6, n35, p35(3) 
Sept 4, 1989 

DOCUMENT TYPE: evaluation ISSN: 0740-1604 LANGUAGE: ENGLISH 
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WORD COUNT: 2265 LINE COUNT: 00186 

detected and various readability indexes, including the 
education-level rating. 

It also delivers the average number of sentences per paragraph, 
words per sentence and syllables per word. It presents the number of 
words, passive verbs, prepositions, question marks and exclamation points 
used in the document. 

Grammatik III is... 
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Services. TRW builds credit risk models using bureau information. 
Behavior Models Formerly Looked Just at Master File Information 
In behavior modeling, what typically occurred in the past was the use 
of master file information for account monitoring activities--such as 
credit line increases or decreases, authorizations and collections. 

Behavior modeling has been effective for delinquencies of 30 and 60 
days using only master file information, Esquinas said. But farther 
along, at 90, 120 or 180 days past due, looking at master file data was 
of limited value. 

Collectible and non-collectible accounts both appear the same in 
terms of various characteristics from the master file--balance, how long 
past due, how long someone... 

...everybody begins to look homogeneous," Esquinas said. 

The reliance on bureau information, in addition to master file data, 
was due to the frustration of collection agencies, financial institutions 
and retailers. The collection environment traditionally has offered a 
shotgun approach, Esquinas emphasized. At the point when the collection 
agency receives information from the financial institution, the account 
has been charged off. 

Distinguishing accounts as to collectibility... 
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Communication in Latin America: an analysis of Guatemalan business letters. 
Conaway, Roger N.; Wardrope, William J. 
Business Communication Quarterly, 67, 4, 465(10) 
Dec, 2004 

ISSN: 1080-5699 LANGUAGE: English RECORD TYPE: Full text 

WORD COUNT: 3769 LINE COUNT: 00339 

researchers independently translated the letters and agreed on the 
substantive content when translation of difficult words appeared. After 
translations to English were completed, each researcher independently 
counted the number of sentences in each letter sequentially and 
identified the sentence he thought stated the writer's central... 



13/3, K/7 (Item 2 from file: 148) 

DIALOG (R) File 148: Gale Group Trade & Industry DB 
(c)2008 The Gale Group. All rts. reserv. 

09841124 SUPPLIER NUMBER: 19781070 (USE FORMAT 7 OR 9 FOR FULL TEXT) 
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Tolar, Martin Michael 

Journal of Socio-Economics, v26, n2, p291(12) 
May-June, 1997 

ISSN: 1053-5357 LANGUAGE: English RECORD TYPE: Fulltext; Abstract 
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interest rates and supposes that individuals are capable of making 
distinctions between economic variables in both nominal and real terms. 
Secondly real interest rates appear to be highly variable in the 
Australian experience over time, (ILLUSTRATION FOR FIGURE 2 OMITTED... 

...and the methodologies employed in past examinations call into question 
the REPIH. 

DATA AND METHODOLOGY 

Data 

Our investigation involves the collection and analysis of both 
primary and secondary data. 
Primary Data 

We employ the use of survey data which is obtained from the 
administration of a... 
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include: 

* Providing a central system function to organize, maintain and 
control access to the data base of electronic documents. For example, 
access to personnel records needs to be tightly controlled while corporate 
marketing strategies... 

...distribution services, versioning services, etc. 

* Providing several methods to intelligently access information -- 
full text retrieval (find all documents that contain these words or 
phrases), keyword indexing (find all ...for document management systems 
is growing at more than 35 per cent a year. Gartner Group estimates the 
worldwide market for document management software and integration will 
grow (excludes hardware) to $2 billion by 1997 as the... 
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Differential patterns of textual characteristics and company performance in 
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WORD COUNT: 8671 

...TEXT: on equity (ROE), when analysing syntactical characteristics such 
as word count, syllables per word and words per sentence, they found 
that only word count statistically significant; high ROE firms were more 
verbose than low ROE firms. However, it is... 
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...ABSTRACT: track of TREC-12. The first method consists of asking the user 
to select a number of sentences that represent documents. The second 
method consists of showing to the user a list of noun phrases extracted 
from the initial document set. Both methods then expand the query 
based on the user feedback. The TREC results show... 
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DETERRENCE VERSUS BRUTALIZATION: CAPITAL PUNISHMENT'S DIFFERING IMPACTS 
AMONG STATES 

Shepherd, Joanna M 

Michigan Law Review v104n2 PP: 203-255 Nov 2005 
ISSN: 0026-2234 JRNL CODE: MLW 
WORD COUNT: 18137 

...TEXT: studies of capital punishment's deterrent effect and other capital 
punishment researchers have used similar data sets. I am now, for the 
first time, using the data to estimate separate deterrent effects for 
individual states. 

We should expect some differences among the data sets in the states 
that fall into each group: deterrent, no effect, and brutalization. In some 



...statistically significant effect that state-level data did not. The 
varying time periods of the data sets may also result in differences if 
states experienced deterrence or brutalization during some years, but not 
others. Nevertheless, the results from the other data sets can support 
the primary data set's evidence that capital punishment has different 
impacts in different states. 

A. State-Level Monthly... 

...and other variables at the state level over the period 1977-1999. I used 
this data set in another recently published study in The Journal of 
Legal Studies. Because the data and... 

...death row sentence in a given month is defined as a moving average of 
the number of death row sentences in the current and previous eleven 
months divided by a similar twelve-month moving average... 
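
Note: The twelve-month moving average described in this excerpt (the current month plus the previous eleven) can be sketched in Python as below. Only the averaging step is shown; the denominator of the ratio is elided in the excerpt and is not reconstructed here. The function name and the choice to return None for months without a full window are assumptions.

```python
def trailing_moving_average(values, window=12):
    """Average of each value and the (window - 1) values before it.

    Positions without a full window of history are returned as None.
    """
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            averages.append(sum(chunk) / window)
    return averages


if __name__ == "__main__":
    monthly_counts = list(range(1, 14))  # 13 months of made-up counts
    print(trailing_moving_average(monthly_counts))
```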
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How effectively do marketing journals transfer useful learning from 
scholars to practitioners? 

Crosier, Keith 
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WORD COUNT: 6887 

...TEXT: remainder of the document?" Click the No button. 

A new dialogue box, Readability Statistics, will appear automatically. 

It provides: count of words, characters, paragraphs and sentences; 
averages of sentences per paragraph, words per sentence and characters per 
word; a percentage figure... 
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An evaluation of help mechanisms in natural language information retrieval 
systems 

Kreymer, Oleg 

Online Information Review v26n1 PP: 30-39 2002 
ISSN: 1468-4527 JRNL CODE: ONCD 
WORD COUNT: 5578 

...TEXT: Sometimes a system, claiming to use, for instance, "syntactic 
parsing" (i.e. analysis of the sentence structure), would end up 
counting the appearance of query terms in a document. As a result, 
retrieved documents were irrelevant to a search. This would... 
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Cutting costs or raising revenue? 

Hollman, Lee 
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. . .TEXT: employees . 

Starting at $10,000, the software's price varies with the number of 
knowledge base articles you create. The starting price entitles you to 
500 articles. For approximately $25,000, you... 

...frames and JavaScript complicate keyword searches. So the spider 
includes natural language processing capabilities to find appropriate 
content from each site. 

Customers can enter keywords and full sentences, and One Step can 
recognize typos without requiring you to save a... 
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Managed in Hong Kong: Adaptive Systems, Entrepreneurship and Human 
Resources 

Selmer, Jan 

ASEAN Economic Bulletin v18n2 PP: 247-249 Aug 2001 
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...TEXT: Western connotations. Based on original empirical research, the 
author compares the extent of organizational commitment, both in 
behavioural and attitudinal terms. As expected, she finds that the 
Chinese exhibited a lower level on both counts than their Western 
counterparts, but... 

...stores, Yaohan, has closed its doors for the last time. Nevertheless, 
the chapter provides interesting primary data collected through 
in-depth interviews of both Japanese expatriates and local employees of 
Yaohan and Jusco... 
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Anonymous 
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...TEXT: checking; search trees, result ranking and best match searching; 
and links to thesauri and related word strings generated by 
co-occurrence rankings. 

Ray R. Larson, School of Library and Information Studies, University of 
California at Berkeley ...overload. The Cheshire II system includes the 
following design features: * It supports SGML as the primary data base 
format of the underlying search engine; it is a client/server application 
where the... 
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...TEXT: pro-active philosophy, however, is often more productive. This 
means setting up meetings with concerned groups and sending out 
preliminary information on the plans of the site and the benefits of 
cellular service to the public... 

...consultants. Because it is a meeting of record and many concerned 
residents also may be present, every word that is spoken by the 
company representative will be analyzed and interpreted. Having standard 
rehearsed... 



16/3, K/1 (Item 1 from file: 16) 

DIALOG (R) File 16: Gale Group PROMT(r) 
(c) 2008 The Gale Group. All rts. reserv. 

13909820 Supplier Number: 160640487 (USE FORMAT 7 FOR FULLTEXT) 
Pathway research in plants. (Instruments & Systems: Bioinformatics Focus) 

Bioscience Technology, v32, n2, p24(1) 
Feb, 2007 

Language: English Record Type: Fulltext 
Document Type: Magazine/Journal; Trade 
Word Count: 105 

(USE FORMAT 7 FOR FULLTEXT) 
TEXT: 

...000 plant-specific abstracts and four full-text plant research journals. 
In addition, 382,000 sentence co-occurrence facts for plant proteins 
are available. 
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de nous; mais aussi nous n'aimons, nous n'embrassons rien de 
reel . [12] 

This sentence was located in the ARTFL database with a 
co-occurrence search for the patterns "idE[aeo] ,*", "rE[ea]l.*" and "quant." 
I specifically wanted to... 
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...ABSTRACT: words in a sentence. The second method uses proximity 
relationships, particularly information about the ordered co-occurrence 
of words in a sentence, to approximate the dependency relationships 
between them. A structured index has been constructed for these... 
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...ABSTRACT: of our experiments for weighted retrieval is the surprising 
result that features that describe the co-occurrences of words in 
sentence-size or paragraph-size windows are significantly better 
descriptors than purely word-based indexing features. 
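
Note: Co-occurrence features over sentence-size windows, as mentioned in this abstract, can be illustrated with the Python sketch below. It counts, for each unordered pair of words, the number of sentences in which both appear. The sentence splitter and the function name are assumptions, and the abstract's actual feature definition and weighting are not given in the excerpt.

```python
import re
from collections import Counter
from itertools import combinations


def sentence_cooccurrences(text):
    """Count unordered word pairs that appear together in the same sentence."""
    counts = Counter()
    for sentence in re.split(r"[.!?]+", text):
        # Sort the distinct words so each pair has one canonical order.
        words = sorted(set(re.findall(r"[a-z]+", sentence.lower())))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    pairs = sentence_cooccurrences("Cats chase mice. Dogs chase cats.")
    for pair, n in sorted(pairs.items()):
        print(pair, n)
```

Switching to paragraph-size windows would only require splitting the text on blank lines instead of sentence punctuation.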
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